Chapter 1
Introduction 


The work on analytic prediction of emergent dynamics undertaken at Utah State University has
itself been an exercise in multiagent coordination. Three distinct perspectives, drawn from the
Computer Science Department, the Mathematics Department, and the Electrical and Computer
Engineering Department, were brought to bear on problems of multiagent coordination.

• The computer science perspective was closely tied to real-time scheduling and planning
questions, leading to case-based negotiation. In this work, described in Chapter 2,
autonomous negotiating systems are composed of logically separated software agents that
control resources and altruistically seek to perform useful work in a cooperative manner. The
work environment is classified into resources, tasks, and missions. Each resource has a
predefined set of functionalities that define the actions the resource can perform, and each
task requires one or more functionalities to be applied to it for a specific amount of time.
All resources providing the requisite functionalities must rendezvous for the duration of that
time in order to complete the task. Each mission is composed of a set of tasks and a partial
ordering among those tasks, represented with a directed acyclic graph. Missions, tasks, and
resources are represented by software agents. This study examines negotiation between
those agents, using a strategy that improves over time through gained experience.
A case-based negotiation strategy is presented that allows self-organized scheduling
of the tasks. Through software simulations, the study shows that important characteristics of
system performance are positively affected by such experience-based negotiations.

• The mathematics department examined questions of task completion under a general
resource allocation model, as discussed in Chapter 3. The allocation problem was examined
as a nonlinear differential equation, which was used to predict completion ability. This
predictive model was then compared with simulation models. An appendix describes the
simulation software.

• The electrical engineering department examined the “praxeic decision theory” approach to
multiple agent coordination. Chapter 4 introduces the concept, beginning with single agent
systems and then extending to multiple agent systems in negotiation. Inference (the problem
of estimating the goals of other agents in the arena) is also discussed. Another viewpoint
toward multiple agent systems is also presented using catastrophe theory. Two distinct
nonlinear models for multiagent behavior are examined. In both cases, it is determined that
a “phase transition” behavior is to be expected. This phase transition behavior is distinct
from the type of phase transition from “easy problems” to “hard problems” frequently
discussed in the multi-agent literature, which is due to systems with a large number of agents.
Instead, it arises from the nonlinearity, which gives rise to “cusp” singularities on the manifold
of parameter spaces.

Chapter 2


Organizing  Missions  for  Autonomous 
Resources  Using  Case-Based  Negotiation 

2.1  Introduction 

Autonomous Negotiating Systems are composed of logically (or even geographically) separated
software agents that control logical or physical resources and altruistically seek to perform useful
work in a cooperative manner.

This study examines negotiation between autonomous agents, using a negotiation
strategy that improves over time through gained experience. A case-based negotiation strategy is
presented that allows self-organized scheduling of tasks on distributed resources. Through software
simulations, this study shows that important characteristics of system performance are positively
affected by such experience-based negotiations.

It  is  often  useful  to  classify  the  work  environment  into  resources,  tasks  and  missions.  Generally 
speaking,  tasks  represent  work  to  be  accomplished,  resources  represent  items  used  to  achieve  work, 
and  missions  represent  overall  goals  that  can  be  accomplished  by  the  successful  completion  of  one 
or  more  tasks. 

Each resource has a predefined set of functionalities that define the actions that the resource can
perform. A resource can perform at most one functionality at a time, and may need a startup time,
t_startup (as in the case of travel time for physically distributed resources), before the appropriate
functionality can be applied. In this study, t_startup is assumed to be 0.

Each task requires one or more functionalities to be applied to it for a specific amount of
time. It is assumed that all resources providing the requisite functionalities must rendezvous for
the duration of that time in order to complete the task. Tasks are also ascribed an arrival time,
t_arrival, indicating the time a task enters the system and a need for work to be accomplished; an
earliest start time, t_earliest, before which no work can be performed; and a deadline, t_deadline, before
which all work must be completed. Tasks that are not completed before their deadline fail, and are
summarily removed from the system.
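
The timing attributes above can be illustrated with a minimal sketch. The class name, field names, and the `duration` field are assumptions made for this illustration; they are not taken from the study's simulator, and the treatment of a task that finishes exactly at its deadline is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Timing attributes of a task (illustrative names only)."""
    name: str
    t_arrival: float    # time the task enters the system
    t_earliest: float   # no work may be performed before this time
    t_deadline: float   # all work must be completed by this time
    duration: float     # time for which the functionalities must be applied

    def can_start(self, now: float) -> bool:
        # Work may begin only at or after the earliest start time, and only
        # if it can still finish by the deadline (assumed inclusive).
        return now >= self.t_earliest and now + self.duration <= self.t_deadline

    def has_failed(self, now: float, completed: bool) -> bool:
        # An uncompleted task fails once its deadline has been reached.
        return not completed and now >= self.t_deadline
```

A task that has failed under this check would then be removed from the system, as described above.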

Each mission is composed of a set of tasks, a partial ordering among those tasks, t_arrival, t_earliest,
and t_deadline.

This partial ordering of tasks can be represented with a directed acyclic graph (DAG) where
the nodes of the graph represent tasks and each arc (i, j) from node i to node j represents a
time-ordering of i and j (i.e., task i must be completed before task j can begin).


Figure  2.1:  Example  of  partially  ordered  tasks  in  a  mission 


As an example, for the mission represented in Figure 2.1, task 1 (T1) must be completed before
T2 or T3 can begin, T2 must be completed before T4 or T5 can begin, and all the other tasks must
be completed before T6 can begin.
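
The partial ordering in this example can be sketched as a list of arcs, from which the tasks that are currently allowed to begin are derived. The arc list is an assumption reconstructed from the prose (the figure itself may contain additional arcs), and the function names are hypothetical.

```python
# Precedence arcs (i, j): task i must complete before task j can begin.
# Assumed from the prose example; T1 and T2 precede T6 transitively.
ARCS = [("T1", "T2"), ("T1", "T3"), ("T2", "T4"), ("T2", "T5"),
        ("T3", "T6"), ("T4", "T6"), ("T5", "T6")]


def predecessors(arcs):
    """Map each task to the set of tasks that must finish before it."""
    preds = {}
    for i, j in arcs:
        preds.setdefault(i, set())
        preds.setdefault(j, set()).add(i)
    return preds


def ready_tasks(arcs, completed):
    """Tasks not yet completed whose predecessors have all completed."""
    preds = predecessors(arcs)
    done = set(completed)
    return sorted(t for t, p in preds.items() if t not in done and p <= done)


print(ready_tasks(ARCS, set()))                # ['T1']
print(ready_tasks(ARCS, {"T1"}))               # ['T2', 'T3']
print(ready_tasks(ARCS, {"T1", "T2", "T3"}))   # ['T4', 'T5']
```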

Each  resource  is  controlled  by  a  resource  agent  that  is  responsible  for  finding  useful  work  for 
that  resource  to  perform.  The  resource  agent  negotiates  with  other  agents  to  arrive  at  a  schedule 
of  work  for  that  resource,  maintains  that  resource’s  schedule  of  work  to  perform,  and  directs  the 
resource when to begin and end work for each task it is scheduled to participate in. In this study
there is a one-to-one correspondence between resources and resource agents, although in general one
resource agent might be responsible for many resources.

In the autonomous system, there is a community of task agents. Each task agent is responsible
for overseeing the completion of tasks, including finding and booking resources for each task and
monitoring the progress of those tasks. Task agents have permanence in that they oversee the
completion of many tasks during their lifetime. In this study, each task agent supervises at most
one  task  at  a  time,  from  the  time  the  work  is  first  requested  until  the  time  the  work  is  completed  (or 
the  task  fails).  New  task  agents  are  created  as  needed;  thus  the  number  of  task  agents  in  the  system 
is  equal  to  the  maximum  number  of  known  tasks  at  any  single  moment  in  time.  The  removal  of 
task  agents  from  the  system  is  not  considered  here. 

In the same manner, each mission is supervised by a mission agent that is responsible for
finding a task agent for each of its member tasks. Mission agents are also given the responsibility
for determining t_arrival, t_earliest, and t_deadline for each task, based on the t_arrival, t_earliest, and
t_deadline of the overall mission.

Mission,  task,  and  resource  agents  negotiate  to  determine  acceptable  allocations  of  resources 
to  tasks  extended  in  time.  The  negotiation  strategies  are  founded  on  case-based  negotiation. 

In case-based negotiation [1], a case stores successful and unsuccessful negotiating strategies
gained from experience. Each mission and task agent maintains a library of cases that are created
and refined as that agent negotiates with others.

Results of a simulation study in which the case-based negotiation is compared to a simple
strategy that does not rely on experience indicate that there is a positive effect of experiential
learning on the negotiation process, and that experience-based autonomous scheduling strategies
can adapt to new environments without intervention.

The rest of this chapter is organized as follows: Section 2.2 introduces current related work by
other investigators in this field, most specifically those examining case-based strategies. Section 2.3
focuses on the details of the negotiation strategy, including the negotiating mechanism, case definition,
case parameters, and algorithms. The simulation experiments are described in Section 2.4.
Results from a set of different approaches are presented, and an analysis of each result is given.
Summarizing and concluding remarks are provided in Section 2.5.


2.2  Background 

Research  into  the  behavior  and  uses  of  software  agents  is  varied  and  widespread.  A  unifying 
theme  is  in  examining  the  potential  for  software  agents  to  exhibit  expertise  through  competition 
or  cooperation.  Agents  may  be  managing  private  resources  as  in  the  case  of  web  agents  [2]  and 
email  highlighting  agents  [3],  among  others.  Some  systems  employ  multiple  agents  [4]  that  adapt 
to  the  current  community  of  agents,  while  other  systems  rely  on  single  agents  [5]  that  ‘travel’  in 
a  distributed  environment,  adapting  to  diverse  conditions  and  providing  functionality  that  would 
otherwise  be  cumbersome,  perhaps  even  infeasible. 

Agents can negotiate using different models, such as declarative descriptions [6] that rely on
a rule-based representation language to automate negotiations of business contracts, commitments
[7] that capture the obligations from one party to another, and argumentative negotiation [8], which
is based on values of private information and preferences.

Negotiation  between  agents  can  occur  in  a  single  transaction,  or  can  be  accomplished  in  several 
steps,  as  in  [9],  which  introduces  a  multidimensional,  multi-step  negotiation  mechanism  for  task 
allocations  among  agents. 

As in this study, the multi-step negotiating strategy of [9] improves over time; [9], however,
improves by constructing multiple protocols that adapt to different situations.

Resource allocation can also be determined by applying a scheme globally instead of negotiating.
[10] presents the Marbles schemes, a family of cooperative and adaptive algorithms in which all
the requirements and resource properties are known a priori.

Negotiation efficiency can be improved by calculating statistics on interaction performance. [2]
discusses improving the efficiency of web agent interactions. By knowing the distribution
of access time, an agent can optimize its access strategy (or negotiation strategy). Thus, statistical
results can be applied to agent negotiations to improve performance. Whenever access is
not stable (e.g., the internet connection is interrupted), this strategy is useful in determining when
it is appropriate to renew access to the previous site, or to a new site.

Case-based  negotiating  [1]  can  be  applied  to  resource-private  agent  systems,  and  is  a  good 
example  of  how  an  agent  negotiates  for  the  use  of  other  resources  in  order  to  complete  one  or  more 
tasks  promptly.  Negotiations  use  case-based  reasoning  [11]  to  learn,  select,  and  apply  negotiation 
strategies. Case-based reasoning is used as a basis for this study and is more fully described in
Section 2.3.

This study differs from other work in focusing on the negotiation between task agents and
resources. Its contribution is in providing a case-based negotiation strategy between
task agents and resources to achieve a solution for mission completion, specifically in scenarios
where tasks are scheduled for resources in advance due to the known ordering of tasks inside a
mission. This study shows that case-based negotiation can be beneficial. A defining element that
distinguishes this study from others [12, 13, 7, 6] is the juxtaposition of autonomy, deadlines, and
the focus on systems that are loaded to the point of task failure as a result of missed deadlines.


2.3  Case-based  Negotiating 

An argumentative negotiation is adopted in this study, in which task agents negotiate with resources
by presenting one or more arguments to convince resource agents to allocate their resources to the
task. An argument is an expression indicating the value of a feature of an agent, in the form
[feature] [comparison operator] [value] (e.g., priority = high, negotiation time = 10, etc.).
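
The argument form can be sketched as a small data structure. The set of comparison operators, the class name, and the `holds` check are assumptions made for illustration; the source specifies only the three-part form of an argument.

```python
import operator
from dataclasses import dataclass

# Comparison operators an argument may use (an assumed, illustrative set).
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
       "<=": operator.le, ">=": operator.ge}


@dataclass(frozen=True)
class Argument:
    feature: str
    op: str
    value: object

    def holds(self, features: dict) -> bool:
        """True if the agent's value for this feature satisfies the expression."""
        if self.feature not in features:
            return False
        return OPS[self.op](features[self.feature], self.value)


args = [Argument("priority", "=", "high"),
        Argument("negotiation time", "<=", 10)]
task_features = {"priority": "high", "negotiation time": 8}
print(all(a.holds(task_features) for a in args))  # True
```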

Several primitives are defined as negotiation messages (e.g., ‘require’, ‘accept’, ‘decline’, etc.),
each of which has its own parameters. In general, task agents request resource agents to fulfill
their functionalities by supplying a list of arguments in order to convince the resource agents. The
resource agent then evaluates those arguments and replies to the task agent with an acceptance or
rejection of the task agent’s request. A resource agent may make a counteroffer in the form of its own
argument. The task agent may then accept, decline, or make a counteroffer again until both sides
reach an agreement.

Case-based negotiation is an application of case-based reasoning (CBR) [11]. Instead of giving
a diagnosis or solution to a problem, this study uses the diagnosis or solution as the current
negotiation strategy. [1] presents a case-based negotiation that uses CBR to select, apply, and learn the
negotiation strategies that the agent uses.

In [1], the negotiation of agents is targeted at requesting resources from other cooperative
agents. Each resource is local to an individual agent. An agent must negotiate with others to
cooperate in the use of resources to complete tasks. An agent uses different negotiating strategies
at different instants, because the current agent status or the global environment variables change
dynamically. Each agent stores those different negotiation strategies as cases, which record the
negotiation parameters (or strategies) used under a certain environment.

This study is based on distributed (or shared) resources. It is targeted at presenting a solution for
resource scheduling and allocation in a multitask, multi-resource, soft real-time environment,
where the partial ordering of tasks is known and any task may fail due to the lack of competition.
The structure of each mission (i.e., tasks and needed functionalities) is assumed to be known in
advance. Dynamic creation or redefinition of missions is not considered here. This study does not
examine the real-time case where the negotiation is limited by time, although the focus here is on
finding allocations of resources quickly.

2.3.1 Case Library

Each task agent maintains a case library. Each case retains environment descriptors and negotiation
parameters from previous negotiation transactions that may be used in the current negotiation if
the current environment is closely related to the case environment.

Each case is composed of two parts. The first part is a vector of descriptors of that case’s
environment. Each descriptor gives the value of one feature of the case’s environment, for
example, the negotiation time for the current task. The second part of the case is the set of negotiating
parameters, such as priority, number of unfilled functionalities, etc.

In  this  study,  the  environment  descriptor  portion  of  a  case  is  composed  of  three  parts:  system 
information,  self  information,  and  resource  information. 

System  information  includes  the  ratio  of  the  number  of  required  functionalities  by  active  tasks 
versus  the  number  of  supplied  functionalities  by  available  resources.  This  parameter  is  based  on 
data  that  is  perceived  by  each  individual  agent,  gained  by  monitoring  requests  and  responses  by 
task  and  resource  agents  over  the  communication  medium.  This  is  useful  for  determining  how 
competitive  it  is  for  current  tasks  to  request  resources.  Additionally,  system  information  includes 
the  ratio  of  average  negotiation  time  for  active  tasks  to  the  maximum  negotiation  time  perceived 
by that agent. This parameter is intended to measure to what extent active tasks can alleviate the
load on the system.

Self information of the task agent is composed of two elements. The first element is the ratio
of self-negotiation time to the maximum negotiation time, as a measure of past negotiating
performance; the second element is the ratio of the number of required functionalities of this task to
the possible maximum number of functionalities known to be required by any task. This variable
measures the relative difficulty of the task to fulfill its functionalities.

Resource information for this study adopts one variable: the ratio of the number of free
resources to the total number of resources perceived by any agent, under the rationale that, if more
resources exist, tasks are easier to fulfill.

The set of negotiating parameters for each case in this study is composed of four elements:
priority, time required, the number of unfilled functionalities, and the ratio of (t_earliest - t_arrival) to
(t_deadline - t_arrival), which indicates the percentage of total time before the deadline that can be used
for negotiation. The priority, time required, and the number of unfilled functionalities for each task
are nonnegotiable. The negotiation time between tasks and resources is adjustable by each task.
Tasks in general seek a low value for negotiation time, because tasks desire to complete as soon as
possible.
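
As a sketch of how a case might be laid out, the following groups the five environment descriptors and the four negotiating parameters described in this section into one record. All field names and the choice of an integer priority are assumptions made for illustration; only the quantities themselves come from the text.

```python
from dataclasses import dataclass


def negotiation_fraction(t_arrival, t_earliest, t_deadline):
    """Ratio (t_earliest - t_arrival) / (t_deadline - t_arrival)."""
    window = t_deadline - t_arrival
    if window <= 0:
        raise ValueError("deadline must be later than arrival")
    return (t_earliest - t_arrival) / window


@dataclass
class Case:
    # Environment descriptors (all ratios, as described in the text).
    demand_to_supply: float                # system information
    avg_to_max_negotiation_time: float     # system information
    self_to_max_negotiation_time: float    # self information
    required_to_max_functionalities: float # self information
    free_to_total_resources: float         # resource information
    # Negotiating parameters.
    priority: int                          # nonnegotiable
    time_required: float                   # nonnegotiable
    unfilled_functionalities: int          # nonnegotiable
    negotiation_fraction: float            # adjustable by the task


print(negotiation_fraction(t_arrival=0, t_earliest=5, t_deadline=20))  # 0.25
```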

2.3.2  Case  Selection  and  Retrieval 

The case-based negotiating strategy in general evaluates cases using weighted matching,
employing different matching functions for different features. For example, the environment
descriptors in Figure 2.2 have two features: A and B. Each feature is assigned a similarity function
Similarity(i, j) that calculates the similarity of two values of this feature. The overall similarity of
any two environments is calculated as a weighted sum of the feature similarities. After evaluation,
the most similar case (i.e., the one with the maximum similarity result) is selected.

After the most similar case is selected, the strategy from this case is used to control the
negotiation.
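
Weighted matching can be sketched as follows. The particular similarity function (one minus the absolute difference of two ratios), the weights, and the dictionary layout of a case are assumptions chosen for illustration; the source states only that per-feature similarities are combined by a weighted sum and that the maximum is selected.

```python
def ratio_similarity(a, b):
    """Similarity of two ratio values in [0, 1]; 1.0 means identical (assumed form)."""
    return 1.0 - abs(a - b)


def overall_similarity(env_a, env_b, weights, sim_funcs):
    """Weighted sum of the per-feature similarities."""
    return sum(weights[f] * sim_funcs[f](env_a[f], env_b[f]) for f in weights)


def most_similar_case(current_env, library, weights, sim_funcs):
    """Return (case, similarity) for the best match, or (None, 0.0) if empty."""
    best, best_sim = None, 0.0
    for case in library:
        s = overall_similarity(current_env, case["env"], weights, sim_funcs)
        if best is None or s > best_sim:
            best, best_sim = case, s
    return best, best_sim


weights = {"A": 0.75, "B": 0.25}
sims = {"A": ratio_similarity, "B": ratio_similarity}
library = [{"env": {"A": 0.0, "B": 1.0}, "strategy": "s1"},
           {"env": {"A": 0.5, "B": 0.5}, "strategy": "s2"}]
case, score = most_similar_case({"A": 0.5, "B": 1.0}, library, weights, sims)
print(case["strategy"], score)  # s2 0.875
```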


Figure  2.2:  An  example  of  similarity  comparison 


2.3.3  Case  Storage  and  Learning 

If the negotiation fails, the task agent refines the negotiation strategy in a ‘conservative’ way (e.g.,
as in this study, increasing the percentage of time before deadline that is allocated to negotiation).
Conversely, if the negotiation succeeds, the task agent refines the negotiation strategy in a more
‘adventurous’ way (e.g., as in this study, decreasing the percentage of time allocated to
negotiation). After the refinement, the new case is stored in the case library.

Before storing a new case, it is compared to the most similar case that already exists in the case
library. If they are similar within a threshold, the new case is discarded so that the case library
remains a stable size, and the strategy in the new case is reconciled with that in the similar case.
For example, the compromise can be reached by averaging the two values of each negotiating parameter.
Some new cases are identified as ‘irrational’ and are discarded outright; an example is a case that
asks for less negotiating time than the similar case even though its environment has scarcer
resources.
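
The storage rule can be sketched as below. The threshold value, the dictionary keys, and the order in which the ‘irrational’ check and the averaging are applied are assumptions for illustration; the source does not specify them.

```python
def store_case(library, new_case, similarity, threshold=0.9):
    """Insert new_case, or reconcile it with its nearest neighbour.

    library: list of dicts with 'env' (descriptor dict that includes a
    'free_ratio' entry) and 'negotiation_fraction'.
    similarity(env_a, env_b): returns a value in [0, 1].
    Returns 'stored', 'merged', or 'discarded'.
    """
    if not library:
        library.append(new_case)
        return "stored"
    nearest = max(library, key=lambda c: similarity(c["env"], new_case["env"]))
    if similarity(nearest["env"], new_case["env"]) < threshold:
        library.append(new_case)          # sufficiently novel: keep it
        return "stored"
    # 'Irrational': scarcer resources, yet less negotiating time requested.
    if (new_case["env"]["free_ratio"] < nearest["env"]["free_ratio"]
            and new_case["negotiation_fraction"] < nearest["negotiation_fraction"]):
        return "discarded"
    # Compromise by averaging the negotiating parameter of the two cases.
    nearest["negotiation_fraction"] = (
        nearest["negotiation_fraction"] + new_case["negotiation_fraction"]) / 2
    return "merged"
```

Because a similar new case only updates the existing one, the library grows only when a genuinely different environment is encountered.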

2.3.4  Task  Algorithm 

Each  task  agent  negotiates  with  available  resources  until  all  of  its  functionalities  have  been  filled. 
A  task  can  find  all  the  possible  resources  by  collecting  responses  from  available  resources  after  it 
broadcasts  a  ‘request’  message  to  all  resources.  After  the  task  chooses  a  resource  and  sends  out 
the request to this resource, it may get an ‘accept’ or a ‘decline’ message from the resource. The
following is a formalized algorithm on the task side:

L1. If all the functionalities have been filled, go to L6. Otherwise, get the next unfilled
functionality.

L2. Broadcast the request for the current functionality, and set up a set of resource agents as
potential negotiators by monitoring responses from resource agents.

L3. Choose the next resource agent from the set of resource agents, determine the current
environment, search the case library, and try to find the matching case.

If the matching case is not found, create a new case with default arguments. Prepare negotiation
with the current resource agent.

If the matching case is found, fetch the arguments from the case. Prepare negotiation with the
current resource agent.

L4. Send a request to the current resource agent.

L5. Wait until one of the following happens:

If the task runs out of negotiation time, release all the resources previously scheduled, refine
the negotiation strategy, and store this case into the case library. Go to L7.

If the task receives an ‘accept’ message, put this resource into a scheduled resource list, and go
to L1.

If the task receives a ‘decline’ message, go to L3 if the counteroffer is impossible, or go to L4
if the counteroffer is adopted.

L6. The task starts executing. After it finishes, refine the negotiation strategy, store this new
case into the case library, and release all resources previously occupied.

L7. Save the statistical data and exit.
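
The control flow of steps L1 through L5 can be sketched synchronously as below. This is a simplification under stated assumptions: negotiation time is counted as one unit per request, case lookup and counteroffers are omitted, and the `candidates` and `request` callables are hypothetical stand-ins for the broadcast and message exchange.

```python
def negotiate_task(functionalities, candidates, request, time_budget):
    """Synchronous sketch of task-side steps L1-L5.

    functionalities: functionality names still to be filled.
    candidates(f): resource ids that answered the broadcast for f (L2).
    request(resource, f): returns 'accept' or 'decline' (L4-L5).
    time_budget: negotiation time, counted as one unit per request.
    Returns the scheduled {functionality: resource} map, or None if the
    task runs out of time or no resource accepts.
    """
    scheduled = {}
    elapsed = 0
    for f in functionalities:              # L1: next unfilled functionality
        for resource in candidates(f):     # L3: choose next resource agent
            if elapsed >= time_budget:     # L5: out of negotiation time
                return None                # caller releases and refines
            elapsed += 1
            if request(resource, f) == "accept":
                scheduled[f] = resource    # L5: record, then back to L1
                break
        else:
            return None                    # no resource accepted f
    return scheduled                       # all filled: proceed to L6
```

A caller would refine and store the case after either outcome, corresponding to the refinement in L5 (on timeout) or L6 (after execution).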

2.3.5  Resource  Algorithm 

Each resource has three states: idle, scheduled, and active. Scheduled resources can be grabbed (i.e.,
allocated to) by other higher priority tasks, but active resources cannot be grabbed by any task. Idle
resources can of course be grabbed by any task. Upon a request from a task, a resource evaluates
the arguments passed by the negotiation message, and makes a decision based on the result from
the utility function that is used to evaluate the importance of a task. Each
resource maintains a threshold, which measures how strong (or how important) the arguments
must be. A decision is made by comparing the result from the utility function with the current
threshold. A formalized algorithm on the resource side is as follows:

L1. Wait until one of the following happens:

If a ‘request’ message is received from any task, go to L2.

If an ‘activate’ message is received, change the current state to ‘active’. Go to L1.

If a ‘release’ message is received, go to L4.

L2. Calculate the result from the utility function parameterized by the arguments from the
negotiation.

L3. Identify the current state.

If the current state is ‘idle’, send an ‘accept’ message back, change the state to ‘scheduled’, set up a
new threshold from the result, and remember the scheduled task and its functionality. Go to L1.

If the current state is ‘scheduled’, the task competes for the resource. If the result is higher than
the current threshold, the resource sends an ‘accept’ message back while informing the already
scheduled task of ‘lost resources’, changes the currently scheduled task to this new task, and renews the
functionality. Go to L1. Otherwise, send a ‘decline’ message back, and go to L1.

If the current state is ‘active’, send a ‘decline’ message back with a counteroffer, which indicates the
remaining time for which the resource stays ‘active’. Go to L1.

L4. Change the current threshold to 0, mark the currently scheduled task and functionality as null, and
reset the current state to ‘idle’. Go to L1.
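
The resource-side steps can be sketched as a small state machine. The class and method names are hypothetical, the utility value is assumed to be computed by the caller (step L2), and the counteroffer sent by an active resource is omitted for brevity.

```python
class Resource:
    """Sketch of the resource-side steps; state names follow the text."""

    def __init__(self):
        self.state = "idle"
        self.threshold = 0.0
        self.task = None
        self.notified = []   # tasks informed that they lost this resource

    def on_request(self, task, utility):
        """Handle a 'request'; utility is the utility-function result (L2)."""
        if self.state == "idle":                       # L3, idle branch
            self._schedule(task, utility)
            return "accept"
        if self.state == "scheduled" and utility > self.threshold:
            self.notified.append(self.task)            # 'lost resources'
            self._schedule(task, utility)
            return "accept"
        return "decline"     # weaker competing request, or resource is active

    def on_activate(self):
        self.state = "active"

    def on_release(self):                              # L4
        self.state, self.threshold, self.task = "idle", 0.0, None

    def _schedule(self, task, utility):
        self.state, self.threshold, self.task = "scheduled", utility, task
```

Storing the winning utility as the new threshold means a scheduled resource can only be taken over by a request with strictly stronger arguments.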


2.4  Experiments 

Figure 2.3 shows the experimental model. A task distributor generates and distributes new tasks,
simulating the pattern of partially-ordered tasks in the missions. Task agents accept tasks from the
task distributor. Each task agent accepts at most one task at a time. Each task agent dispatches
a thread for each task and negotiates with resources to fulfill all functionalities required by this
task. Task agents are responsible for the maintenance of their case libraries, including new case
insertion, old case refinement, and case removal. Negotiation occurs between task agents and
resource agents (e.g., in Figure 2.3, there are 20 resources (or resource agents) available). The result
of each task is recorded after it completes (or fails).


Figure 2.3: Experimental Model


Before conducting the experiments, values must be fixed for some experimental parameters
(e.g., the number of tasks, the number of resources, etc.). The following predetermined
parameters remain unchanged in this study.

Number of resources = 20. This parameter describes how many resources are available in the
system.

Maximum number of functionalities = 8. This parameter indicates the maximum number of
different functionalities that can be set up inside a task or a resource.

Maximum number of functionalities per task or resource = 3. This parameter sets an
upper bound on the number of functionalities that can be required by any task or offered by any
resource.

Levels of priority = 3. Priority is a parameter used in the case-based negotiation. Each task has
a predetermined value for its priority (e.g., ‘high’, ‘medium’ or ‘low’).

Maximum negotiation time = 20 (time units). This parameter is adjustable by each task, and
a resource can make a counteroffer to a task on this parameter.

Maximum running time = 20. This parameter indicates the maximum time the task can run after
all the functionalities allocated to this task are ‘active’ on it.

Task sample collection rate = 10. This task calculation interval indicates the number of tasks
that complete before the next graph point is calculated.

Based  on  the  above  parameters,  a  simple  program  is  used  to  produce  a  random  set  of  resources 
and  tasks  so  that  the  internal  functionalities  offered  by  each  resource  and  required  by  each  task  are 
randomly  distributed. 
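
A generator of this kind might look like the following sketch. The uniform distributions, the minimum of one functionality per item, the task count of 50, and all names are assumptions for illustration; only the bounds (20 resources, 8 functionalities, at most 3 per task or resource, 3 priority levels, maximum running time 20) come from the parameters above.

```python
import random


def random_functionalities(rng, n_functionalities=8, max_per_item=3):
    """Pick 1..max_per_item distinct functionality ids from 0..n_functionalities-1."""
    k = rng.randint(1, max_per_item)
    return sorted(rng.sample(range(n_functionalities), k))


def generate(rng, n_resources=20, n_tasks=50):
    """Return (resources, tasks) with randomly distributed functionalities."""
    resources = [random_functionalities(rng) for _ in range(n_resources)]
    tasks = [{"needs": random_functionalities(rng),
              "priority": rng.choice(["high", "medium", "low"]),
              "run_time": rng.randint(1, 20)}
             for _ in range(n_tasks)]
    return resources, tasks


resources, tasks = generate(random.Random(0))
print(len(resources), len(tasks))  # 20 50
```

Passing an explicitly seeded `random.Random` instance keeps a generated workload reproducible across runs.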

Experiments are conducted to observe (1) average task completion time under case-based
negotiation with varying task arrival rates at different times. The goal is to examine the impact that
cases have on task completion over time. When the task arrival rate varies, another group
of cases adapted to the changed environment is expected to emerge. These experiments also observe the
learning rate of cases under different refinement strategies; (2) average task completion time
under a simple negotiation strategy in which the percentage of negotiation time remains constant.
These experiments examine task completion performance
under different percentages of negotiation time; (3) average completion time with different case
granularity in case-based negotiation.

By changing the task-distributing intensity, the volume of the task stream (or the workload of the
system) can be adjusted.

(1) Figure 2.4 shows the average task completion time under case-based negotiation with
different refinement strategies. There are two refinement strategies in these experiments: an
‘aggressive’ refinement strategy and a ‘conservative’ refinement strategy. The ‘aggressive’
refinement strategy increases the percentage of negotiation time by 0.2 each time a task fails, and keeps
the percentage of negotiation time unchanged each time a task completes. The ‘conservative’ refinement
strategy increases the percentage of negotiation time by 0.05 each time a task fails, and decreases the
percentage of negotiation time by 0.01 each time a task completes. As shown in Figure 2.4, the task arrival
rate changes from 0.2 per time unit to 1 per time unit at the 100th time cycle.
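
The two refinement rules can be written as a single update function. The step sizes are those stated above; the function name and the clamping of the result to the interval [0, 1] are assumptions, since the text does not state bounds.

```python
def refine(fraction, task_completed, strategy):
    """Return the updated percentage of time allocated to negotiation."""
    if strategy == "aggressive":
        delta = 0.0 if task_completed else 0.2
    elif strategy == "conservative":
        delta = -0.01 if task_completed else 0.05
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Clamping to [0, 1] is an assumption; the text does not state bounds.
    return min(1.0, max(0.0, fraction + delta))
```

For instance, starting from 0.1, one failure moves the aggressive strategy to about 0.3 and the conservative strategy to about 0.15.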

Under each of the refinement strategies, there are initially no cases in the case library. Because
the negotiation time is initially small, there is a significant possibility for each task to fail
during the starting phase of negotiation. As a result, the average completion time is low at the
beginning. After some time, cases become adapted to the current environment and negotiation time
(i.e., the amount of time in which agents are allowed to negotiate before resources become active
on that task) increases. Average task completion time increases because more tasks complete (or
fewer tasks fail). After the task arrival rate changes from 0.2 per time unit to 1 per time unit
(at the 100th time cycle), there is a significant drop in average task completion time because
there are no cases suitable for the new environment. Agents begin making ‘conservative’ decisions
by limiting negotiation time to provide a ‘safety net’ of additional time in which to complete a
task. After some time, new cases are set up that are adapted to the new environment, and the
average task completion time improves.

Figure 2.4: Average task completion time in case-based negotiation

Comparing the two refinement strategies, Figure 2.4 demonstrates that the ‘aggressive’ refinement
strategy yields a faster learning rate of cases (that is, a faster rate of adapting to new
environments) than the ‘conservative’ refinement strategy.

(2) Under simple negotiation, the percentage of negotiation time is constant and cannot be changed
during the negotiation. Figure 2.5 shows the average task completion time for different
percentages of negotiation time. The task arrival rate changes from 0.2 per time unit to 1 per time
unit at the 100th time cycle.

Figure 2.5 indicates that, in a simple negotiation, the average task completion time stays within a
constant range unless the task arrival rate changes. After the task arrival rate changes from a
‘slow’ arrival rate to a ‘fast’ arrival rate (i.e., at the 100th time cycle in Figure 2.5), the
average task completion time increases only if the percentage of negotiation time (0.5 in Figure
2.5) still accommodates the heavier workload. Conversely, the average task completion time
decreases if the current negotiation time (percentage of negotiation time = 0.33 or 0.25 in Figure
2.5) cannot accommodate the current workload. Based on these observations, a constant percentage of
negotiation time may fail because it cannot adapt to varied environments.

Case-based negotiation is able to adjust negotiation parameters to adapt to new environments.
Therefore, case-based negotiation shows a positive effect compared to the simple negotiation.
Compared with the average task completion time in the simple negotiation, case-based negotiation
can also reach a high task completion time by speeding up the case-learning rate.

Figure 2.5: Average task completion time in simple negotiation

(3) By changing the threshold of the similarity function that determines the difference between
cases, the case library can have different case granularity. Figure 2.6 indicates that different
granularity causes different learning speeds for new cases, so that it takes more or less time for
the cases to adapt to new environments. As shown in Figure 2.6, at a low task arrival rate,
granularity does not significantly affect the completion time, because each case is not sensitive
under a low workload environment. Conversely, in a high task arrival rate environment, the average
task completion time with low granularity cases is higher than that with high granularity cases.


2.5  Summary 

As a characteristic of negotiation performance, the average task completion time is prolonged
after cases have been learned. The jagged curve shows a completion time that gradually stabilizes
as new cases are learned or old cases are refined. The simple negotiation keeps a relatively
constant completion time, instead of showing an improved or gradually stabilized curve. These
results also demonstrate that, in case-based negotiation, whenever an extreme change happens in the
system, new cases are created and another round of case learning begins and stabilizes after a
period of time.

Results for different case refinement strategies and case granularities indicate different rates
of stabilization in the average task completion time. The aggressive refinement strategy and low
granularity cases make the average completion time stabilize faster than the conservative
refinement strategy and high granularity cases, because aggressive refinement strategies make cases
adapt to new environments faster, and low granularity alleviates the sensitivity to environmental
changes.

Figure 2.6: Average task completion time in different granularity cases

Case-based negotiation shows a significant benefit in adapting to different environments. In the
simple negotiation, by contrast, constant negotiation parameters risk producing a low task
completion time because they cannot adapt to new environments. Case-based negotiation is able to
reach a high task completion time by speeding up the adaptation rate, comparable to the benefit
that some ‘generous’ parameters create in the simple negotiation.

Chapter 3

Agent-based  Task  Completion 


3.1  Predicting  Agent-based  Task  Completion 

3.1.1  Summary  of  Results 

This  chapter  presents  a  model  for  solving  a  resource  allocation  problem  (the  ‘Screaming  Generals’ 
problem)  in  which  autonomous  agents  negotiate  for  use  of  the  resources.  The  Screaming  Generals 
problem is a test-bed for our ideas about task completion in a multi-agent environment with hard
deadlines. Rather than analyze some specific negotiation scheme, we present a model that accepts
the results of negotiation as input. Our characterization of the results of negotiation is based on the
priorities built into the negotiation scheme - for example, some negotiations commonly result in an
approximately  equal  distribution  of  resources.  Three  different  negotiation  strategies  are  presented, 
and  while  these  are  by  no  means  exhaustive,  our  framework  easily  accommodates  the  addition  of 
new  strategies.  We  then  analyze  how  well  the  agents  complete  tasks  under  different  negotiation 
inputs,  using  both  analytical  and  numerical  techniques.  Our  numerical  techniques  allow  us  to 
determine  the  regimes  in  which  given  negotiation  strategies  are  superior  to  others,  and  to  estimate 
the  asymptotic  rates  of  task  completion  as  the  number  of  tasks  increases. 

3.1.2  Introduction 

In the context of computer science, an agent is some entity (whether virtual or physical) that has
control over its own actions. A variety of applications have been found for agents, some involving
searching or bidding over networks (such as the internet). The Defense Advanced Research Projects
Agency (hereafter DARPA) is interested in using them to replace conventional human and computer
resources in applications such as logistics, reconnaissance, and combat. Agents have the advantage
of being able to make decisions on their own while still being able to communicate with other
agents. Moreover, if the physical housing of an agent is damaged or destroyed, other agents are not
dependent on the missing agent, and no lives are lost. Our research mandate from DARPA was to begin
analyzing systems of agents to observe how they perform.

Consider a set of tasks which require the use of some set of resources for their completion or
performance. Some examples include assigning CPU cycles in a Beowulf cluster, radar emitters in a
naval fleet, or more basically slots in a schedule. A classical approach would be to decide on a
distribution of resources that would allow the tasks (or some portion of them) to be completed.
While this method has its merits and has been widely studied (e.g., scheduling, linear
programming), it is a centralized approach - a unique solution is determined and resources are
allocated accordingly.

The  centralized  method  also  presupposes  that  the  entity  determining  the  solution  has  control 
of  the  resources  as  well  as  responsibility  for  scheduling  appropriate  allocations  of  those  resources. 
For  our  purposes  an  agent  is  similar  to  this  entity,  except  that  it  only  controls  some  subset  (possibly 
empty)  of  the  resources.  A  collection  of  these  agents  forms  an  autonomous  system,  in  which  the 
agents  can  negotiate  with  one  another  for  use  of  the  resources.  Completion  of  tasks  depends  on 
the  behavior  of  the  agents.  If  the  negotiation  occurs  in  a  time-critical  environment  it  introduces  an 
interesting  trade-off  between  negotiation  and  task  completion.  Even  under  the  assumption  that  an 
agent  can  ‘talk’  and  ‘work’  at  the  same  time,  time  spent  negotiating  can  still  produce  a  delay  in 
reallocating  the  resources  to  adapt  to  a  change  in  circumstances. 

This  is  a  general  description  of  the  agent-based  approach  which  could  be  adapted  to  a  wide 
variety of problems. For purposes of this paper we will discuss a narrower regime in which the
details of the negotiation are suppressed. Regardless of how the agents actually conduct negotiations,
they  will  arrange  for  some  distribution  of  resources  (presumably  in  finite  time).  As  an  example, 
consider  a  simple  bazaar  system.  Initially  agents  are  assigned  a  certain  amount  of  money,  which 
may  depend  on  the  importance  of  their  task,  its  degree  of  completion,  and  its  proximity  to  deadline. 
The  agents  then  bid  on  a  large  supply  of  homogeneous  widgets.  Agents  controlling  widgets  make 
counter-offers,  and  in  general  a  price  is  agreed  upon  (after  negotiation)  that  is  somewhere  in  the 
middle.  After  all  the  money  is  spent,  each  bidder  will  have  a  number  of  widgets  proportional  to  the 
amount  of  money  it  was  given  initially.  We  introduce  the  concept  of  resource  allocation  strategies 
to  describe  this  end-result.  Thus  we  only  consider  two  aspects  which  result  from  negotiation  -  the 
final  allocation  and  the  time  spent  reaching  that  allocation.  Perhaps  the  agents  have  the  goal  of 
hammering  out  a  fair  share  of  the  resources  for  each  agent.  Conceivably  there  are  many  ways  to 
do  this,  but  as  far  as  the  completion  of  tasks  is  concerned  all  that  matters  is  how  long  it  takes  to 
achieve  the  fair  division.  Any  other  goal,  such  as  completing  smaller  tasks  or  critical  tasks  first,  can 
be  accommodated  by  these  strategies.  Our  goal  in  this  paper  is  to  develop  a  modeling  philosophy 
for  describing  task  completion  by  autonomous  agents  and  determine  the  conditions  under  which  a 
given  strategy  is  superior  to  other  proposed  strategies. 

We have considered some scenarios that could be analyzed in this manner. The first is completing
tasks in a distributed computing environment (e.g., a Beowulf cluster). Tasks can be assigned
processing  time  according  to  the  size  of  the  task,  the  task’s  deadline,  the  task’s  assigned  priority, 
or  other  factors.  The  tasks  can  be  any  problem  that  can  be  usefully  split  into  pieces  such  as  list 
sorting  or  signal  processing.  Each  task  has  an  agent  assigned  to  complete  it  by  negotiating  with 
the  other  agents  for  use  of  CPU  time. 

Another example is the DARPA-ANTS challenge problem. ANTS stands for Autonomous Negotiating TeamS.
This problem is interesting precisely because of the possibility of a decentralized solution. A
decentralized network presents no obvious or critical target for an enemy to focus on, and can
presumably function just as well if a few nodes are destroyed. In the challenge problem several
radar sensors are positioned around a model railroad and tasked with tracking one or more trains.
Monitoring a radar ‘track’ for each train is a task with a responsible agent, and the radar
stations and the timing of their emissions are the resources, controlled by other agents. In this
problem, as in many others, the central questions are: under what conditions can successful task
completion be guaranteed, and how does negotiation overhead influence task completion?

In this paper, we will first present a conceptual model called the ‘Screaming Generals’ problem,
which will allow us to address these questions. This formulation is independent of any specific
application. After describing this problem in the form of a system of differential equations, we
propose three resource allocation strategies and proceed to analyze solution characteristics. We
then show the results of a numerical simulation of the problem, using the different strategies. Our
results will illustrate two points. First, our analytical methods provide insights into the nature
and complexity of the problem, and we can bound the performance of a resource allocation strategy.
Second, by using numerical simulation and data fitting we can determine the best strategy for
given conditions.


3.2  Modeling  Task  Completion 

3.2.1  The  Screaming  Generals  Problem 

We are specifically considering divisible tasks, that is, tasks whose accomplishment can
theoretically be subdivided into many small (ideally identical) portions. In our Beowulf cluster
example, this is true for list sorting, image processing, and numerical computations, among other
things (these tasks do not necessarily have to be done in parallel). For each task this provides a
natural index of completion: the fraction of the task which remains undone ($F_j$). By examining
how this fraction decreases in time we will be able to predict how different strategies for
resource allocation impact the completion of individual tasks. Furthermore the tasks are
time-sensitive in that they must be completed by a certain deadline or else be considered total
failures. Deadlines are critical because many problems need to be solved in some finite amount of
real time. The radar tracking problem, for instance, has very definite deadlines based on the
hardware requirements and the demands of physics - if the sensors spend too much time negotiating
they will not have enough time to produce accurate tracking results before the target moves on.

As a conceptual model for divisible tasks we think of ditches. Each ditch, labeled j, requires a
certain number of man-hours, $R_j$, to dig. Each ditch has a general who has overall responsibility
for making sure that the ditch gets dug, and who negotiates for men with other generals, all from
a fixed pool of M men on base. A basic model for resource allocation is by how loudly each
general ‘yells’ in comparison to the other generals. Based on a variety of factors (proximity of
deadline, length of ditch, etc.) a general may choose to negotiate at greater or lesser volume. The
number of men a general receives on an hourly basis is in direct proportion to the volume at which
the general is yelling. By building various models for how a general’s loudness varies with ditch
completeness and deadline we will examine how different negotiating outcomes affect the rate of
task completion.

Quantity    Units          Description
j           -              Task (ditch index)
N           -              Number of currently active tasks
F_j         -              Fraction of ditch j remaining un-dug
M           men            Total number of men available to dig all ditches
f_j         -              Fraction of total resources currently allocated to task j
t           hrs            Current time
t_j         hrs            Time when task j began
D_j         hrs            Deadline for completion of task j
L_j         feet           Total length of ditch j
s_j         feet           Distance currently dug along ditch j
r_j(s_j)    man-hrs/feet   Work density required to dig ditch at a distance s_j along the ditch
R_j         man-hrs        Total number of man hours required to complete task j

Figure 3.1: List of parameters and variables used.


3.2.2  Task  Completion  Modeling 

The fraction of ditch j remaining to be dug, $F_j$, is given by

\[ F_j = 1 - \frac{s_j}{L_j}, \tag{3.1} \]

where $s_j$ is the distance currently dug and $L_j$ is the length of the ditch. In the case of a
ditch with variable consistency (and therefore varying difficulty in digging along its length) an
input-output constitutive relation holds for progress:

Man-hours required to dig a distance $\Delta s$ at a spot $s_j$ feet into the ditch
$= r_j(s_j)\,\Delta s = f_j M \,\Delta t =$
Man-hours allocated to task j for the amount of time $\Delta t$ required to dig a distance $\Delta s$.

Here $r_j(s_j)$ is the work density required at distance $s_j$ along the ditch and $f_j$ is the
fraction of men M assigned to task j as a result of negotiation. Thus, progress along the ditch
obeys the relationship

\[ \lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} = \frac{ds_j}{dt} = \frac{M f_j}{r_j(s_j)}. \tag{3.2} \]

Differentiating (3.1) gives

\[ \frac{dF_j}{dt} = -\frac{1}{L_j}\frac{ds_j}{dt} = -\frac{M f_j}{L_j\, r_j(s_j)}. \tag{3.3} \]

In the case of a homogeneous ditch (one which requires equal resource per distance), the general
can estimate the total resource commitment required to dig the ditch, $R_j$, as $R_j = r_j L_j$,
where $r_j$ is constant with distance along the ditch. In this case we can write

\[ \frac{dF_j}{dt} = -\frac{M f_j}{R_j}, \qquad F_j(t_j) = 1, \tag{3.4} \]

where $t_j$ is the start time of task j. In order to incorporate negotiation into our model, we
assume that all agents spend some fraction of time, $\beta$, negotiating, where
$\beta \in [\beta_0, \beta_1]$. The constants $\beta_0$ and $\beta_1$ are the minimum and maximum
levels of communication overhead, respectively. We assume that agents cannot work on tasks and
negotiate simultaneously. Consequently, of each small time increment, $\Delta t$, only
$(1 - \beta)\Delta t$ is available for task completion. The natural modification to (3.4) is
therefore

\[ \frac{dF_j}{dt} = -(1 - \beta)\frac{M f_j}{R_j}, \qquad F_j(t_j) = 1. \tag{3.5} \]


The negotiation overhead, $\beta$, can in principle depend on many factors, including the behavior
of the agents. For example assume it is a function only of the number of active tasks N. Let $N_0$
be the number of tasks that result in half-saturation of the network. The negotiation fraction
could be modeled by

\[ \beta = \frac{\beta_0 N_0^{\alpha} + \beta_1 N^{\alpha}}{N_0^{\alpha} + N^{\alpha}}, \qquad \alpha \in \mathbb{N}. \tag{3.6} \]

This function has values of $\beta$ approaching $\beta_0$ for $N \ll N_0$ and values approaching
$\beta_1$ for $N \gg N_0$, with $\beta = (\beta_0 + \beta_1)/2$ at $N = N_0$ and increasing
$\alpha$ creating a more abrupt transition from $\beta_0$ to $\beta_1$. For our analysis we are
only concerned with what the resulting level of communication actually is, regardless of how the
network operates, and will simply assume $\beta = \beta_0$. To find $F_j$ we need to solve the
differential equation (3.5). Since $\beta$, M, and $R_j$ are constants, to finish specifying the
model we need to know the fraction of resources $f_j$ assigned to task j at any time t.
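
As a small illustration of the negotiation-overhead model, the sketch below evaluates a saturating function of the number of active tasks. It assumes that (3.6) has the form beta = (beta0 * N0^alpha + beta1 * N^alpha) / (N0^alpha + N^alpha), which matches the limiting behavior described in the text; the function and parameter names are hypothetical.

```python
def negotiation_fraction(n_tasks, beta0, beta1, n_half, alpha=1):
    """Negotiation overhead as a saturating function of the number of active tasks.

    Assumed form: (beta0 * N0**alpha + beta1 * N**alpha) / (N0**alpha + N**alpha),
    where n_half plays the role of N0, the half-saturation task count.
    """
    numerator = beta0 * n_half**alpha + beta1 * n_tasks**alpha
    denominator = n_half**alpha + n_tasks**alpha
    return numerator / denominator


# Hypothetical parameter values, chosen only to show the shape of the curve.
for n in (0, 5, 10, 20, 1000):
    print(n, round(negotiation_fraction(n, beta0=0.1, beta1=0.5, n_half=10, alpha=2), 3))
```

With these illustrative values the overhead starts at beta0 for an idle network, passes through the midpoint of beta0 and beta1 at N = N0, and approaches beta1 as N grows; a larger alpha makes the transition sharper.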


3.2.3  Resource  Allocation  Models 

Rather than attempt to model some negotiation scheme, we will assume it has a known deterministic
outcome. A resource allocation model $f_j$ is a function that will be used in (3.4) to describe what
fraction of available resources are allocated to task j as a function of the states of all active tasks
and the negotiation process. The models we will present are by no means the only possibilities.
In general the tasks are assigned weights, where the weights are determined by the context of the
problem. The weighting could be determined by prioritizing the tasks, for example. Again
considering the radar tracking problem, it would be sensible to assign a higher priority to targets with
a high velocity or that threaten more critical targets. Any conceivable weighting strategy would
work in our model as long as the weights sum to one.
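
To make the role of an allocation model concrete, the following sketch integrates the task completion equation, dF_j/dt = -(1 - beta) M f_j / R_j with F_j(t_j) = 1, using a forward-Euler step. It is a minimal sketch under stated assumptions, not the numerical method used in this report: the function names, the dictionary layout of the tasks, the fixed step size, and the placeholder equal-split allocation are all hypothetical.

```python
def equal_split(active):
    """Hypothetical placeholder allocation: every active task gets the same share."""
    return {j: 1.0 / len(active) for j in active}


def simulate(tasks, allocate, men, beta, dt=1e-3, t_end=10.0):
    """Forward-Euler integration of dF_j/dt = -(1 - beta) * M * f_j / R_j.

    tasks maps a task label j to a tuple (R_j, t_j, D_j).
    Returns a dict mapping j to its completion time, or to None if the
    deadline passed before the task was finished.
    """
    remaining = {j: 1.0 for j in tasks}   # F_j = 1 when task j starts
    outcome = {}
    for k in range(int(round(t_end / dt))):
        t = k * dt
        # Tasks that reach their deadline unfinished are total failures.
        for j, (_, _, deadline) in tasks.items():
            if j not in outcome and t >= deadline:
                outcome[j] = None
        active = [j for j, (_, start, _) in tasks.items()
                  if j not in outcome and start <= t]
        if not active:
            continue
        shares = allocate(active)          # the weights f_j, summing to one
        for j in active:
            work = tasks[j][0]
            remaining[j] -= (1.0 - beta) * men * shares[j] / work * dt
            if remaining[j] <= 0.0:
                outcome[j] = t + dt
    return outcome
```

For example, `simulate({1: (10.0, 0.0, 3.0)}, equal_split, men=10.0, beta=0.5)` models a single 10 man-hour task worked by 10 men who spend half their time negotiating; any other weighting rule can be passed in place of `equal_split` as long as the returned shares sum to one.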


Democratic  Allocation 

An obvious solution to the problem of allocating resources is to divide them evenly. In the radar
tracking problem discussed in the introduction, if we assume that tracking each of two trains is
equally important, simply have half of the available sensors track each target (neglecting other
considerations such as sensor range). In their work on the same problem, S. Fitzpatrick and L.
Meertens [14] developed a negotiation strategy based on graph coloring that produces an
approximately democratic allocation. We mean democratic in the sense of fairness to the
participants - each agent receives an equal share of the resources

\[ f_j = \frac{1}{N}, \tag{3.7} \]

where N is the number of currently active tasks. A weighted version of democratic allocation is
used to allocate CPU resources in most operating systems.


Crisis  Allocation 

Another weighting factor could be to give ‘critical’ tasks more resources. In the screaming generals
context a critical task is one that is close to its deadline relative to the other tasks. The resources
assigned are distributed according to

\[ f_j = \frac{\dfrac{1}{D_j - t}}{\dfrac{1}{D_1 - t} + \dfrac{1}{D_2 - t} + \cdots + \dfrac{1}{D_N - t}}. \tag{3.8} \]

The fractions $\frac{1}{D_k - t}$, $k \in \{1, 2, \ldots, N\}$, are a measure of each task’s proximity
to the deadline, where $\lim_{t \to D_j} f_j = 1$. For instance if $t = 1$ and $D_1 = 3$, $D_2 = 4$,
$D_3 = 5$ then $f_1 \approx 0.46$, $f_2 \approx 0.31$, $f_3 \approx 0.23$, thus giving the highest
fraction of resources to the task with the nearest deadline.
The idea of giving tasks nearest to deadline the highest priority is used by C.L. Liu and J.W.
Layland in their paper on scheduling tasks on a single processor [15].
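
The three-task example can be checked with a short sketch. It assumes that crisis allocation weights each active task by the reciprocal of its time to deadline, 1 / (D_k - t), and normalizes the weights to sum to one; the function name is hypothetical, and the sketch only handles times strictly before every deadline.

```python
def crisis_allocation(deadlines, t):
    """Share of resources per task, weighted by 1 / (D_k - t).

    Assumes t is strictly earlier than every deadline in the list.
    """
    urgency = [1.0 / (d - t) for d in deadlines]
    total = sum(urgency)
    return [u / total for u in urgency]


shares = crisis_allocation([3.0, 4.0, 5.0], t=1.0)
print([round(f, 2) for f in shares])   # the example in the text: 0.46, 0.31, 0.23
```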

Opportunistic  Allocation 

Smaller tasks are relatively easy to finish, and one (opportunistic) approach would be to finish them
first. The fractions $\frac{1}{R_k F_k}$, $k \in \{1, 2, \ldots, N\}$, are a measure of the inverse
size of the task, and the smallest tasks will receive the most resources via the following formula:

\[ f_j = \frac{\dfrac{1}{R_j F_j}}{\dfrac{1}{R_1 F_1} + \dfrac{1}{R_2 F_2} + \cdots + \dfrac{1}{R_N F_N}}. \tag{3.9} \]

Again we consider a three-task example with $R_1 = R_2 = R_3 = 1$, where the tasks are all the
same size. Let $F_1 = .5$, $F_2 = .4$, and $F_3 = .1$, which results in $f_1 \approx 0.14$,
$f_2 \approx 0.17$, $f_3 \approx 0.69$.
Here task three is the ‘easiest’ so it receives the largest proportion of the resources. R. Armstrong
[16] describes several opportunistic-type strategies for completing tasks in a distributed computing
environment.
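
The opportunistic example can be reproduced in the same way. The sketch assumes each task is weighted by the reciprocal of its remaining work, 1 / (R_k F_k), with the weights normalized to sum to one; the function name is hypothetical, and finished tasks (F_k = 0) are assumed to have been removed beforehand.

```python
def opportunistic_allocation(work, remaining):
    """Share of resources per task, weighted by 1 / (R_k * F_k).

    work holds the total man-hours R_k and remaining holds the undone
    fractions F_k; every F_k is assumed to be strictly positive.
    """
    weights = [1.0 / (r * f) for r, f in zip(work, remaining)]
    total = sum(weights)
    return [w / total for w in weights]


shares = opportunistic_allocation([1.0, 1.0, 1.0], [0.5, 0.4, 0.1])
print([round(f, 2) for f in shares])   # the example in the text: 0.14, 0.17, 0.69
```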

3.2.4  Dimensional  Analysis  and  the  Critical  Start  Time 

To obtain a dimension-free form of (3.5) we let

\[ \tau = (1 - \beta_0)\frac{M}{R_1}\,t, \qquad \delta_j = (1 - \beta_0)\frac{M}{R_1}\,D_j, \qquad \rho_j = \frac{R_1}{R_j}. \tag{3.10} \]

Here $\tau$ is the fraction of time elapsed since the start of task one and before its minimum
completion time. This implies that it is completed according to

\[ \frac{dF_1}{d\tau} = -1, \quad F_1(0) = 1 \quad \Longrightarrow \quad F_1(\tau) = 1 - \tau. \tag{3.11} \]

We can see that this task will take one unit of dimensionless time to complete, provided no other
task interferes.
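
The change of variables can be written out step by step. The short derivation below assumes the scaling $\tau = (1-\beta_0) M t / R_1$ and $\rho_j = R_1 / R_j$, with $\beta = \beta_0$ and time measured from the start of task one; it is a sketch of the intermediate algebra rather than a quotation from the original derivation.

```latex
% Chain rule with t = R_1 \tau / ((1 - \beta_0) M):
\frac{dF_j}{d\tau}
  = \frac{dF_j}{dt}\,\frac{dt}{d\tau}
  = -(1-\beta_0)\frac{M f_j}{R_j}\cdot\frac{R_1}{(1-\beta_0)M}
  = -\frac{R_1}{R_j}\,f_j
  = -\rho_j f_j .
% A single task in isolation has f_1 = 1 and \rho_1 = 1, so
\frac{dF_1}{d\tau} = -1, \qquad F_1(0) = 1
  \quad\Longrightarrow\quad F_1(\tau) = 1 - \tau .
```

In this form the only parameters left are the relative task sizes $\rho_j$, the dimensionless start times $\tau_j$, and the dimensionless deadlines $\delta_j$.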

A central feature of the screaming generals problem is that the agents have no prior knowledge
of the tasks they will be assigned. If they did, the resource allocation problem could presumably
be solved using a more sophisticated method. Thus, we will assume that from the agents’ point
of view the tasks assigned are random in start time, deadline, and/or size. In terms of our
dimensionless parameters we will say that these parameters have a uniform distribution given by
$\tau_j \sim U[0, T]$, $\delta_j \sim U[0, D]$, and $\rho_j \sim U[0, R]$, respectively, where T, D,
and R are the maximum values for each parameter. There are several reasons for our choice of a
uniform distribution. Uniform distributions are easier to analyze and very easy to simulate. In
addition, without any knowledge of the specific tasks the agents will be solving, a uniform
distribution is the fairest to use in evaluating the performance of different resource allocation
strategies, and gives ample opportunity to test the effect of extreme parameter choices and
interactions among extremes. However, in principle there is no reason an arbitrary distribution
cannot be used.

Since task parameters are random we need to quantify the probabilities of any events we are
interested in. The probability one task will succeed (or fail) in isolation is something we want to
know. From a combinatorial perspective there are a huge number of possible events, with various
numbers of existing tasks, new tasks, and successful/failed tasks. A possible generalization is to
maintain a running count of task probabilities. For instance, suppose we know the probability of
one task succeeding in isolation. Then we add a task and find the probability that the addition of
a new task will cause the first task to fail. We would then need to find the probability that the new
task can be successful given the existence of the first task.

This method can be extended to an arbitrary number of tasks, assuming the deadlines are ordered
according to $\delta_1 < \delta_2 < \cdots < \delta_N$, which is reasonable if we allow the tasks to
be re-indexed (and since time is a continuous variable $P(\delta_i = \delta_j) = 0$). If this
ordering holds then the first task that will fail is task one. So the only new quantities to
compute with the addition of task N + 1 are the probability that task one will fail and the
probability that task N + 1 will succeed given the existence of N tasks. This is a simplification -
it is possible that one of the tasks will succeed before task one fails, thus reducing the system
to N tasks. However, this will increase the available resources, making our previously computed
value for the probability of task one failing an over-estimate and the probability for task N + 1
succeeding an under-estimate. This method, then, will consistently under-estimate the probability
of successful task completion.

In order to find these probabilities we introduce the concept of a critical start time, $\tau^*$.
Consider the two-task case and assume task one will be successful in isolation, which implies that
there exists some time $\tau$ such that $\tau \le \delta_1$ and $F_1(\tau) = 0$, i.e. the task will
finish before or at the deadline. When task two is introduced at time $\tau_2$ it will cause task
one to be completed later than it would have in isolation, because task two is now using some of
the resources task one was using. It follows that by making $\tau_2$ earlier and using more of task
one’s resources, we will eventually cause task one to finish exactly at its deadline, meaning that
$F_1(\delta_1) = 0$. This value of $\tau_2$ is the critical start time for task two and will be
labeled $\tau_2^*$. Now we can formulate the probability of success for task one given the addition
of task two as $P(\tau_2 > \tau_2^*)$. If task two starts before the critical time it will cause
task one to fail by consuming too many of the resources. In general $\tau_n^*$ is the critical
start time of the newest task that could cause task one to fail.


Democratic  Allocation 


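If $N = 2$ and the tasks have start times $\tau_1$ and $\tau_2$, we have a linear system

\[
\begin{aligned}
dF_1 &= -\frac{(1-\beta_0)M}{2R_1}\,dt = -\frac{1}{2}\,d\tau
  &&\Longrightarrow& \frac{dF_1}{d\tau} &= -\frac{1}{2}, & F_1(\tau_2) &= 1 - \tau_2,\\
dF_2 &= -\frac{(1-\beta_0)M}{2R_2}\,dt = -\frac{1}{2}\rho_2\,d\tau
  &&\Longrightarrow& \frac{dF_2}{d\tau} &= -\frac{\rho_2}{2}, & F_2(\tau_2) &= 1,
\end{aligned}
\qquad \tau_2 > \tau_1 = 0, \tag{3.12}
\]

which has the solution

\[ F_1 = 1 - \frac{1}{2}\tau - \frac{1}{2}\tau_2, \qquad F_2 = 1 - \frac{\rho_2}{2}\tau + \frac{\rho_2}{2}\tau_2, \qquad \tau_2 \le \tau \le 2 - \tau_2. \tag{3.13} \]

The critical start time in this case is

\[ F_1(\delta_1) = 0 = 1 - \frac{1}{2}(\delta_1 + \tau_2^*) \quad \Longrightarrow \quad \tau_2^* = 2 - \delta_1. \tag{3.14} \]

Consider a specific example where $\rho_2 = 1$ and task two starts at its critical time,
$\tau_2 = \tau_2^* = 2 - \delta_1$. This situation is illustrated in Figure 3.2.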
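
The two-task result lends itself to a quick numerical check. The sketch below evaluates task one's remaining fraction under democratic allocation, computes the critical start time 2 - delta_1, and estimates by Monte Carlo the probability that a uniformly distributed start time for task two leaves task one on schedule. The function names, the sample size, and the parameter values delta_1 = 1.5 and T = 1.5 are assumptions chosen for illustration; they are not the values behind Figure 3.2.

```python
import random


def task_one_remaining(tau, tau2):
    """F_1(tau) for two tasks under democratic allocation (task two starts at tau2)."""
    if tau <= tau2:
        return 1.0 - tau                    # task one works alone at full rate
    return 1.0 - 0.5 * tau - 0.5 * tau2     # resources split evenly afterwards


def critical_start_time(delta1):
    """Start time of task two at which task one finishes exactly at its deadline."""
    return 2.0 - delta1


def survival_probability(delta1, tau2_max, samples=100_000, seed=0):
    """Monte Carlo estimate of P(tau_2 > tau_2*) with tau_2 ~ U[0, tau2_max]."""
    rng = random.Random(seed)
    threshold = critical_start_time(delta1)
    hits = sum(rng.uniform(0.0, tau2_max) > threshold for _ in range(samples))
    return hits / samples


delta1 = 1.5                                # hypothetical dimensionless deadline
print(task_one_remaining(delta1, critical_start_time(delta1)))
print(survival_probability(delta1, tau2_max=1.5))
```

For these illustrative values the exact probability is (T - tau_2*) / T = 2/3, so the estimate offers a simple consistency check on the critical start time formula.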

In  general  if  iV  =  n  the  system  we  have  the  following  by  induction  on  (3.12): 


dFi  _ 

dr 

dF2  ^ 

dr 

dFs  ^ 

dr 


dFn-\ 

dr 

dFn  _ 
dr 


_1 

n  ’ 

Fi\ 

X)  = 

■■  1 

— 

n-l  'fl 

-  F-t 

n-l 

1  ~  • 

•  -  X2, 
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+ 
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—  T 
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—  PI 

n  ’ 

F3I 
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:  1 

+ 

P^  n- 
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n 
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-iX) 
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1 
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'  n-l 

^  Pn-l 

'fi  n-l 

■^n— 1, 

_ pn 

n  ’ 

Fn 

X)  = 

:  1 

(5l  >  Tn  >  .  .  .  >  Ta  >  Ti  =  0, 


(3.15) 

All  n  tasks  must  start  before  the  first  deadline,  otherwise  the  system  really  has  n  —  1  or  fewer  tasks. 
Using  induction  we  obtain  the  solution 


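$$
\begin{aligned}
F_1 &= 1 - \frac{1}{n}\tau - \frac{1}{2}\tau_2 - \frac{1}{6}\tau_3 - \cdots - \frac{1}{(n-1)(n-2)}\tau_{n-1} - \frac{1}{n(n-1)}\tau_n,\\
F_2 &= 1 - \frac{\rho_2}{n}\tau + \frac{\rho_2}{2}\tau_2 - \frac{\rho_2}{6}\tau_3 - \cdots - \frac{\rho_2}{(n-1)(n-2)}\tau_{n-1} - \frac{\rho_2}{n(n-1)}\tau_n,\\
F_3 &= 1 - \frac{\rho_3}{n}\tau + \frac{\rho_3}{3}\tau_3 - \frac{\rho_3}{12}\tau_4 - \cdots - \frac{\rho_3}{n(n-1)}\tau_n,\\
&\;\;\vdots\\
F_{n-1} &= 1 - \frac{\rho_{n-1}}{n}\tau + \frac{\rho_{n-1}}{n-1}\tau_{n-1} - \frac{\rho_{n-1}}{n(n-1)}\tau_n,\\
F_n &= 1 - \frac{\rho_n}{n}\tau + \frac{\rho_n}{n}\tau_n.
\end{aligned}
\tag{3.16}
$$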
Task Completion in the Two-task Democratic Model

Figure 3.2: Task completion under the democratic resource allocation model. Because task two starts at its critical time $\tau_2^*$, task one finishes exactly at its deadline. The non-dimensional parameters are $\rho_2 = 1$ and $\tau_2 = \tau_2^* = 2 - \delta_1$.


Setting $F_1(\delta_1) = 0$ gives the critical start time for task $n$:

$$
\tau_n^* = n(n-1) - (n-1)\delta_1 - \frac{n(n-1)}{2}\tau_2 - \frac{n(n-1)}{6}\tau_3 - \cdots - \frac{n}{n-2}\tau_{n-1}.
\tag{3.17}
$$

This  gives  us  an  appreciation  for  the  complicated  nature  of  the  solution  to  these  systems,  even  with 
an  extremely  simple  resource  allocation.  At  a  minimum  we  must  keep  track  of  n  initial  conditions, 
each  of  which  changes  with  the  addition  or  removal  of  a  task.  And  as  we  will  see  the  solution  can 
be  more  difficult  or  even  impossible  to  obtain  in  the  case  of  other  strategies. 
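
The bookkeeping in (3.16) and (3.17) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the list layout of the start times are hypothetical, and the formulas apply only while all $n$ tasks are running.

```python
def f1_democratic(tau, starts):
    """F_1(tau) from (3.16) once all n tasks are running.

    starts = [tau_1, tau_2, ..., tau_n] with tau_1 = 0 (hypothetical layout).
    """
    n = len(starts)
    # tau_j carries weight 1 / (j (j - 1)) for j = 2..n; starts[j - 1] is tau_j.
    history = sum(starts[j - 1] / (j * (j - 1)) for j in range(2, n + 1))
    return 1.0 - tau / n - history


def critical_start_time(delta1, earlier_starts):
    """tau_n^* from (3.17); earlier_starts = [tau_2, ..., tau_{n-1}]."""
    n = len(earlier_starts) + 2
    history = sum(t / (j * (j - 1)) for j, t in enumerate(earlier_starts, start=2))
    return n * (n - 1) * (1.0 - delta1 / n - history)


# Two tasks: reduces to tau_2^* = 2 - delta_1 from (3.14).
tau2_star = critical_start_time(1.5, [])

# Three tasks: if task three starts at tau_3^*, task one reaches zero
# exactly at its deadline delta_1 = 2.
tau3_star = critical_start_time(2.0, [0.4])
residual = f1_democratic(2.0, [0.0, 0.4, tau3_star])
```

Each added task changes both the slope $1/n$ and the accumulated history term, which is why every initial condition has to be tracked.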


Crisis  Allocation 

The  next  most  complicated  allocation  strategy,  crisis  allocation,  generates  linear  equations,  but 
with  non-constant  coefficients.  We  first  simplify  the  crisis  equations  as  follows: 


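$$
\begin{aligned}
\frac{dF_1}{dt} &= -(1-\beta)\frac{M}{R_1}\left(\frac{D_2 - t}{D_1 + D_2 - 2t}\right), & F_1(t_2) &= 1 - (1-\beta)\frac{M}{R_1}t_2,\\
\frac{dF_2}{dt} &= -(1-\beta)\frac{M}{R_2}\left(\frac{D_1 - t}{D_1 + D_2 - 2t}\right), & F_2(t_2) &= 1.
\end{aligned}
\tag{3.18}
$$

Removing $t$ from the numerator to facilitate integration gives

$$
\begin{aligned}
\frac{dF_1}{dt} &= -(1-\beta)\frac{M}{2R_1}\left(1 + \frac{D_2 - D_1}{D_1 + D_2 - 2t}\right), & F_1(t_2) &= 1 - (1-\beta)\frac{M}{R_1}t_2,\\
\frac{dF_2}{dt} &= -(1-\beta)\frac{M}{2R_2}\left(1 + \frac{D_1 - D_2}{D_1 + D_2 - 2t}\right), & F_2(t_2) &= 1.
\end{aligned}
\tag{3.19}
$$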


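To non-dimensionalize we multiply by one in several places:

$$
\begin{aligned}
\frac{dF_1}{dt} &= -(1-\beta)\frac{M}{2R_1}\left(1 + \frac{(1-\beta)\frac{M}{R_1}(D_2 - D_1)}{(1-\beta)\frac{M}{R_1}(D_1 + D_2 - 2t)}\right), & F_1(t_2) &= 1 - (1-\beta)\frac{M}{R_1}t_2,\\
\frac{dF_2}{dt} &= -(1-\beta)\frac{M}{2R_1}\frac{R_1}{R_2}\left(1 + \frac{(1-\beta)\frac{M}{R_1}(D_1 - D_2)}{(1-\beta)\frac{M}{R_1}(D_1 + D_2 - 2t)}\right), & F_2(t_2) &= 1,
\end{aligned}
\tag{3.20}
$$

which results in the dimensionless form

$$
\begin{aligned}
\frac{dF_1}{d\tau} &= -\frac{1}{2}\left(1 + \frac{\delta_2 - \delta_1}{\delta_1 + \delta_2 - 2\tau}\right), & F_1(\tau_2) &= 1 - \tau_2,\\
\frac{dF_2}{d\tau} &= -\frac{\rho_2}{2}\left(1 + \frac{\delta_1 - \delta_2}{\delta_1 + \delta_2 - 2\tau}\right), & F_2(\tau_2) &= 1,
\end{aligned}
\qquad \tau_2 > \tau_1 = 0.
\tag{3.21}
$$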


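Solving for $F_1$ and $F_2$ gives

$$
\begin{aligned}
F_1 &= 1 - \frac{1}{2}\left[\tau + \tau_2 + \frac{\delta_1 - \delta_2}{2}\log\left(\frac{\delta_1 + \delta_2 - 2\tau}{\delta_1 + \delta_2 - 2\tau_2}\right)\right],\\
F_2 &= 1 - \frac{\rho_2}{2}\left[\tau - \tau_2 + \frac{\delta_2 - \delta_1}{2}\log\left(\frac{\delta_1 + \delta_2 - 2\tau}{\delta_1 + \delta_2 - 2\tau_2}\right)\right].
\end{aligned}
\tag{3.22}
$$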


We observe that $F_1(\delta_1) = 0$ is transcendental in $\tau_2$, so $\tau_2^*$ cannot be found in closed form using elementary functions. A one-term Taylor approximation of the log term allows us to find

$$
\tau_2^* \approx \frac{1}{2}\left(1 + \frac{\delta_2}{\delta_1}\right)\left[2 - \delta_1 + \frac{\delta_2 - \delta_1}{2}\log\left(\frac{\delta_2 - \delta_1}{\delta_2 + \delta_1}\right)\right],
\tag{3.23}
$$

which is valid when $2\tau_2 \ll \delta_1 + \delta_2$.

While it is possible to solve the $n$-task equation, it becomes increasingly difficult to simplify the fractions as $n$ increases. And as we saw with the democratic allocation, the initial conditions are complicated as well. Given the increasing degree of analytic complexity, it is doubtful that direct characterization of $\tau^*$ and calculation of $P(\tau_j > \tau^*)$ to assess success probabilities will be more illuminating than direct simulation.
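
Since the two-task crisis condition $F_1(\delta_1) = 0$ has no elementary closed form, one option is to solve it numerically and compare against the Taylor estimate. The sketch below does this by bisection, assuming $\delta_2 > \delta_1 > 1$ and using the forms of (3.22) and (3.23) given above; the function names and the parameter values $\delta_1 = 1.1$, $\delta_2 = 4$ are illustrative choices, not values from the study.

```python
import math


def f1_at_deadline(tau2, delta1, delta2):
    """F_1(delta_1) from (3.22) for the two-task crisis model (needs delta2 > delta1)."""
    log_term = math.log((delta2 - delta1) / (delta1 + delta2 - 2.0 * tau2))
    return 1.0 - 0.5 * (delta1 + tau2 + 0.5 * (delta1 - delta2) * log_term)


def critical_start_bisect(delta1, delta2, tol=1e-12):
    """Solve F_1(delta_1) = 0 for tau_2 by bisection on [0, delta_1] (assumes delta1 > 1)."""
    lo, hi = 0.0, delta1
    if f1_at_deadline(lo, delta1, delta2) <= 0.0:
        return 0.0  # task one finishes even if task two starts immediately
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f1_at_deadline(mid, delta1, delta2) > 0.0:
            lo = mid  # work remains at the deadline: task two must start later
        else:
            hi = mid
    return 0.5 * (lo + hi)


def critical_start_taylor(delta1, delta2):
    """One-term Taylor estimate (3.23), valid for 2 tau_2 << delta_1 + delta_2."""
    log_term = math.log((delta2 - delta1) / (delta2 + delta1))
    return 0.5 * (1.0 + delta2 / delta1) * (2.0 - delta1 + 0.5 * (delta2 - delta1) * log_term)


exact = critical_start_bisect(1.1, 4.0)
approx = critical_start_taylor(1.1, 4.0)
```

Bisection is safe here because $F_1(\delta_1)$ decreases monotonically in $\tau_2$ on $[0, \delta_1)$ when $\delta_2 > \delta_1$.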


Opportunistic  Allocation 

Potentially  the  most  complicated  allocation  strategy  is  opportunistic,  with  model  equations  which 
are  fully  non-linear.  Simplifying  the  opportunistic  equations  (3.5)  and  (3.9)  with  n  =  2  gives 


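$$
\begin{aligned}
\frac{dF_1}{dt} &= -(1-\beta)\frac{M}{R_1}\left(\frac{R_2F_2}{R_1F_1 + R_2F_2}\right), & F_1(t_2) &= 1 - (1-\beta)\frac{M}{R_1}t_2,\\
\frac{dF_2}{dt} &= -(1-\beta)\frac{M}{R_2}\left(\frac{R_1F_1}{R_1F_1 + R_2F_2}\right), & F_2(t_2) &= 1.
\end{aligned}
\tag{3.24}
$$

Factoring yields

$$
\begin{aligned}
\frac{dF_1}{dt} &= -(1-\beta)\frac{M}{R_1}\left(\frac{F_2}{\frac{R_1}{R_2}F_1 + F_2}\right), & F_1(t_2) &= 1 - (1-\beta)\frac{M}{R_1}t_2,\\
\frac{dF_2}{dt} &= -(1-\beta)\frac{M}{R_1}\frac{R_1}{R_2}\left(\frac{\frac{R_1}{R_2}F_1}{\frac{R_1}{R_2}F_1 + F_2}\right), & F_2(t_2) &= 1.
\end{aligned}
\tag{3.25}
$$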
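Applying our parameter definitions to the opportunistic allocation gives

$$
\begin{aligned}
\frac{dF_1}{d\tau} &= -\frac{F_2}{\rho_2F_1 + F_2}, & F_1(\tau_2) &= 1 - \tau_2,\\
\frac{dF_2}{d\tau} &= -\frac{\rho_2^2F_1}{\rho_2F_1 + F_2}, & F_2(\tau_2) &= 1.
\end{aligned}
\tag{3.26}
$$

Using Maple we find the solution to be

$$
\begin{aligned}
F_1 &= -\frac{1}{2}\left(\frac{\rho_2\tau^2 - 2(1+\rho_2)\tau + 2\rho_2 - 2\tau_2\rho_2 + \tau_2^2\rho_2 + 2}{\rho_2\tau - 1 - \rho_2}\right),\\
F_2 &= -\frac{1}{2}\left(\frac{\rho_2^2\tau^2 - 2\rho_2(1+\rho_2)\tau + 2\rho_2 + 2\tau_2\rho_2^2 - \tau_2^2\rho_2^2 + 2}{\rho_2\tau - 1 - \rho_2}\right),
\end{aligned}
\qquad \delta_1 > \tau > \tau_2.
\tag{3.27}
$$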


Setting $F_1(\delta_1) = 0$ in (3.27) we can solve for $\tau_2^*$ using the quadratic formula to obtain

$$
\tau_2^* = 1 \pm \frac{1}{\rho_2}\sqrt{\rho_2^2 - \rho_2\left[\rho_2\delta_1^2 - 2(1+\rho_2)\delta_1 + 2\rho_2 + 2\right]} = 1 \pm \sqrt{(\delta_1 - 1)\left(\frac{2}{\rho_2} - (\delta_1 - 1)\right)}.
\tag{3.28}
$$

We will use the earlier of the two times unless it is negative or imaginary, in which case we will use the non-negative one. As with the crisis model, the solution of the $n$-task system of equations is increasingly difficult to obtain and decreasingly informative.

In this section we have shown a method for obtaining the critical start time for the last task, which is the minimum information necessary to compute the probability a task will succeed or fail, in the 'worst' case. We will now use these $\tau^*$ to find bounding probabilities for task failure.
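
As a sanity check on the two-task opportunistic solution, the closed form (3.27) can be compared against a direct fourth-order Runge-Kutta integration of (3.26). The sketch below is illustrative only: the function names are hypothetical, and the critical start times are obtained by setting the numerator of $F_1$ in (3.27) to zero at $\tau = \delta_1$, which is an independent rederivation rather than the authors' Maple output.

```python
import math


def closed_form(tau, tau2, rho2):
    """F_1 and F_2 from the closed-form solution (3.27), for tau >= tau2."""
    s = 1.0 + rho2 - rho2 * tau  # rho2*F1 + F2, which decreases linearly in tau
    n1 = (rho2 * tau**2 - 2.0 * (1.0 + rho2) * tau + 2.0 * rho2
          - 2.0 * tau2 * rho2 + tau2**2 * rho2 + 2.0)
    n2 = (rho2**2 * tau**2 - 2.0 * rho2 * (1.0 + rho2) * tau + 2.0 * rho2
          + 2.0 * tau2 * rho2**2 - tau2**2 * rho2**2 + 2.0)
    return n1 / (2.0 * s), n2 / (2.0 * s)


def rhs(f1, f2, rho2):
    """Right-hand side of the dimensionless opportunistic system (3.26)."""
    denom = rho2 * f1 + f2
    return -f2 / denom, -rho2**2 * f1 / denom


def integrate_rk4(tau_end, tau2, rho2, steps=2000):
    """Classical RK4 from tau2 (F1 = 1 - tau2, F2 = 1) to tau_end."""
    h = (tau_end - tau2) / steps
    f1, f2 = 1.0 - tau2, 1.0
    for _ in range(steps):
        a1, a2 = rhs(f1, f2, rho2)
        b1, b2 = rhs(f1 + 0.5 * h * a1, f2 + 0.5 * h * a2, rho2)
        c1, c2 = rhs(f1 + 0.5 * h * b1, f2 + 0.5 * h * b2, rho2)
        d1, d2 = rhs(f1 + h * c1, f2 + h * c2, rho2)
        f1 += h * (a1 + 2.0 * b1 + 2.0 * c1 + d1) / 6.0
        f2 += h * (a2 + 2.0 * b2 + 2.0 * c2 + d2) / 6.0
    return f1, f2


def critical_start_times(delta1, rho2):
    """Both roots of F_1(delta_1) = 0 in tau_2, or None if they are complex."""
    disc = (delta1 - 1.0) * (2.0 / rho2 - (delta1 - 1.0))
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return 1.0 - root, 1.0 + root


numeric = integrate_rk4(0.8, 0.2, 1.0)
exact = closed_form(0.8, 0.2, 1.0)
roots = critical_start_times(1.2, 1.0)
```

The check works because $\rho_2F_1 + F_2$ decreases at the constant rate $\rho_2$, which is what makes the nonlinear system integrable in closed form for two tasks.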


3.3  Task  Completion  Probabilities 

3.3.1  One  Active  Task 

We have a condition for task failure determined by (3.11): if the deadline $\delta_1$ is less than one, the task will fail. Otherwise, it will succeed. Suppose $\delta_1$ is random according to some probability density function $p_\delta$ such that $\delta_1 \in [0, U]$ where $U > 1$ (to guarantee a non-zero probability). Then the probability that the task is a success is

$$
P(\delta_1 > 1) = \int_1^U p_\delta(s)\,ds.
\tag{3.29}
$$

For example, if we let $p_\delta$ be uniform on $[0, 2]$ then (3.29) evaluates to

$$
P(\delta_1 > 1) = \int_1^2 \frac{1}{2}\,ds = \frac{1}{2},
\tag{3.30}
$$

which is an intuitive result since it is equally likely for the deadline to come before or after $\tau = 1$.
3.3.2 Two Active Tasks - Democratic

Let us assume that the first task will succeed if left to its own devices, i.e., $\delta_1 > 1$. With two tasks running under the democratic regime we compare the start time of the second task $\tau_2$ to the critical start time $\tau_2^*$ we obtained in (3.14). If $\tau_2 < \tau_2^*$ then task one will fail due to too many resources being consumed by the other task. Suppose $\tau_2$, the actual start time, is randomly distributed with density $p_\tau$ with $\tau_2 \in [0, T]$ where $2 \ge T > 0$. Then the probability of success is

$$
P(\tau_2 > \tau_2^*) = P(\tau_2 > 2 - \delta_1) = \int_{2 - \delta_1}^{T} p_\tau(s)\,ds.
\tag{3.31}
$$

Again considering a uniform probability on $[0, 2]$ for $p_\tau$ gives a probability of success

$$
P(\tau_2 > 2 - \delta_1) = \int_{2 - \delta_1}^{2} \frac{1}{2}\,ds = \frac{\delta_1}{2}, \qquad 1 \le \delta_1 \le 2.
\tag{3.32}
$$

This result is intuitive if we keep in mind it is conditioned on the success of the first task in isolation. If $\delta_1 = 1$ then the probability of the first task succeeding is now $\frac{1}{2}$. If $\delta_1 = 2$ then the probability of the first task succeeding is one, since it will take two units of dimensionless time to complete in the worst case ($\tau_2 = 0$).
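
The two-task democratic result is simple enough to check by sampling. The sketch below estimates $P(\tau_2 > 2 - \delta_1)$ for $\tau_2$ uniform on $[0, 2]$ and compares it with the analytic value $\delta_1/2$; the function name, sample count, and seed are illustrative assumptions.

```python
import random


def success_probability_mc(delta1, samples=100_000, seed=0):
    """Estimate P(tau_2 > 2 - delta_1) with tau_2 ~ U[0, 2] (democratic, two tasks)."""
    rng = random.Random(seed)
    critical = 2.0 - delta1  # tau_2^* from (3.14)
    hits = sum(1 for _ in range(samples) if rng.uniform(0.0, 2.0) > critical)
    return hits / samples


estimate = success_probability_mc(1.5)
analytic = 1.5 / 2.0  # delta_1 / 2 for a uniform density on [0, 2]
```

The same sampling pattern carries over to the crisis and opportunistic cases, where the critical start time has to come from a numerical root-find instead of a formula.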

3.3.3  Two  Active  Tasks  -  Crisis  and  Opportunistic 

Under the crisis model we found that it is not possible to solve for $\tau_2^*$ explicitly. However, given a value for $\delta_1$ we can find the level curve associated with $F_1(\delta_1) = 0$. We can then integrate over the region where $\tau_2 > \tau_2^*$. For example, a level curve with $\delta_1 = 1$, $\delta_2 \sim U[1, 7]$ and $\tau_2 \sim U[0, 2]$ results in a probability of success of approximately 0.3175 using Monte-Carlo integration. Applying the same parameters and method to the opportunistic case with $\rho_2 \sim U[0, 1]$ we find a success probability of 0.1826.

For the two-task case we needed to use numerical methods - one to find the level curve and one to find the area of the region. With three or more active tasks we would need to find a level surface and then integrate over the proscribed volume, in three-dimensional or higher space. As we continue to add tasks, computing these probabilities becomes more and more expensive in terms of processor time. In addition, our analytical methods become much more difficult with additional tasks. The combination of these two factors leads us to simulate a series of random tasks and analyze the numerical results, as opposed to the more expensive (and cumbersome) numerical realization of analytic results.

3.3.4  Numerical  Simulation 

In our numerical simulation we return to the dimensional model in equation (3.5). For each resource allocation strategy we will simulate random populations of tasks and measure the number of failures and successes in each. After running a large number of these simulations the average proportion of successful tasks for each strategy will be determined, providing us a metric for comparing the efficiency of the strategies. We will also vary deadline, communication overhead, and task loading in order to draw some conclusions concerning the dynamics of the screaming generals problem. The values of the other parameters will be arbitrary. The average proportion of successful tasks is plotted for all three strategies over a wide range of task densities. A sample run is shown in figure 3.3.


Successful  Tasks  as  a  Function  of  Task  Loading 


Figure 3.3: Average proportion of successful tasks over 60 simulations per task loading (horizontal axis) plotted by resource allocation strategy. The solid, dashed, and dotted lines are the democratic, crisis, and opportunistic values, respectively. Note the point where the crisis curve crosses the opportunistic curve.


The predicted values from the previous sections are plotted in approximately the correct location. If there are two tasks in a two time unit region, it follows that, on average, the task loading is approximately 50 over a 50 time unit duration. For each task loading, $N$, over 50 time units, a random vector of $N$ start times is chosen uniformly on $[0, 50]$ with associated deadlines chosen uniformly on [Start, Start + $D$], where $D$ is some maximum deadline. Any deadline which is greater than 50 is set to 50. The system of ODEs described in (3.5) is solved using the fourth-order Runge-Kutta method. The status of each task at its deadline is assessed, and the total number of successes and failures is recorded.
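
The simulation procedure can be sketched as follows for the democratic strategy only. Several things here are assumptions rather than the authors' code: the general $n$-task forms of the crisis and opportunistic rules in (3.5) are not reproduced, a task is taken to be active from its start until it finishes or its deadline passes, task sizes are drawn between 1 and 100 percent of $M$ to avoid division by zero, and all names and the step size are hypothetical.

```python
import random


def random_tasks(n, max_deadline, M=1.0, horizon=50.0, rng=None):
    """Draw n tasks as (start, deadline, size) tuples following the recipe in the text."""
    rng = rng or random.Random()
    tasks = []
    for _ in range(n):
        start = rng.uniform(0.0, horizon)
        deadline = min(rng.uniform(start, start + max_deadline), horizon)
        size = M * rng.uniform(0.01, 1.0)  # assumption: kept away from zero
        tasks.append((start, deadline, size))
    return tasks


def simulate_democratic(tasks, M=1.0, beta=0.0, dt=0.01, horizon=50.0):
    """Return one success flag per task under equal sharing among active tasks."""
    F = [1.0] * len(tasks)  # fraction of each task still remaining

    def rates(t, state):
        active = [i for i, (s, d, _) in enumerate(tasks) if s <= t < d and state[i] > 0.0]
        out = [0.0] * len(tasks)
        for i in active:
            out[i] = -(1.0 - beta) * (M / tasks[i][2]) / len(active)
        return out

    for step in range(int(round(horizon / dt))):
        t = step * dt
        k1 = rates(t, F)
        k2 = rates(t + dt / 2, [f + dt / 2 * k for f, k in zip(F, k1)])
        k3 = rates(t + dt / 2, [f + dt / 2 * k for f, k in zip(F, k2)])
        k4 = rates(t + dt, [f + dt * k for f, k in zip(F, k3)])
        F = [max(0.0, f + dt / 6 * (a + 2 * b + 2 * c + d))
             for f, a, b, c, d in zip(F, k1, k2, k3, k4)]
    # Work on a task stops at its deadline, so the final F is its status at the deadline.
    return [f == 0.0 for f in F]


def success_proportion(n_tasks, max_deadline, runs=60, seed=0):
    """Average proportion of successful tasks over several random populations."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        total += sum(simulate_democratic(random_tasks(n_tasks, max_deadline, rng=rng)))
    return total / (runs * n_tasks)
```

For example, `simulate_democratic([(0.0, 2.5, 1.0), (0.0, 2.5, 1.0)])` shares the resources equally, so both unit-size tasks need two time units and finish before the deadline. Other strategies would only change the share computed inside `rates`.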

To characterize simulation output we need a fit which varies smoothly from zero to one as the input varies from one to infinity. In addition, initial explorations indicate power-law behavior in the tails of the results. We therefore choose

$$
\pi = \frac{kx^b}{1 + kx^b}
\tag{3.33}
$$

as the model function, where $\pi$ is the proportion of successful tasks, $x$ is the task loading, and $k$ and $b$ are constants. This model is equivalent to

$$
\log\left(\frac{\pi}{1 - \pi}\right) = a + b\log x,
\tag{3.34}
$$

where $k = e^a$. Using linear regression on this model we can determine $k$ and $b$ in (3.33) for each average success curve. The model fit to the curves in figure 3.3 is shown in figure 3.5. The correlation coefficient values are approximately 0.99 for democratic, approximately 0.99 for opportunistic, and approximately 0.94 for crisis.
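
The regression step can be illustrated with ordinary least squares on the transformed variables of (3.34). This is a minimal sketch with hypothetical function names; it assumes every success proportion lies strictly between 0 and 1, and the data are synthetic values generated from the model itself rather than simulation output.

```python
import math


def fit_power_law_logit(loads, proportions):
    """Least-squares fit of log(pi / (1 - pi)) = a + b log x; returns (k, b), k = e^a."""
    xs = [math.log(x) for x in loads]
    ys = [math.log(p / (1.0 - p)) for p in proportions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return math.exp(a), b


def model(x, k, b):
    """Model function (3.33)."""
    return k * x**b / (1.0 + k * x**b)


# Synthetic check: data generated from the model are recovered by the fit.
loads = [10, 20, 50, 100, 200, 400]
k_fit, b_fit = fit_power_law_logit(loads, [model(x, 5.0, -1.5) for x in loads])
```

Because the fit is linear in $\log x$, it weights the high-load tail heavily, which is consistent with the emphasis on asymptotic behavior.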


Power  Law  Fit  to  Success  Curves 


Figure 3.5: Power law fit to the success curves in figure 3.3, using the equation described in the text. The communication overhead is $\beta = 0$, the maximum deadline is $D = 6$, and the tasks randomly have sizes between 0 and 100 percent of the available resources. This fit was designed to match the true values asymptotically with increasing task size. Consequently it matches the true values well over most of the domain, but in the 0 to 50 region it overestimates the success probabilities.


Our model also diagnoses the asymptotic rate of decay of the average success proportion with increasing task load to be $O(x^b)$. In figures 3.6 and 3.7 we allowed the communication time $\beta$ to vary between 0 and 0.95, and the maximum deadline to vary between 1 and 20, respectively, when conducting the simulations. The model fit exponent $b$ is plotted against these parameters.

From these plots we can see that both the democratic and opportunistic strategies are fairly consistent, each decaying with a roughly constant exponent. The crisis strategy is more erratic and decays with an exponent of larger magnitude. Clearly crisis is the asymptotic loser in this
Power Law Exponent as a Function of Communication Time

Figure 3.6: Plot of power law exponents against communication overhead $\beta$. The democratic and opportunistic strategies decay consistently over different communication overheads, while crisis decays erratically over the domain. The sharp increase at $\beta = 0.85$ occurs not because the strategy gets better, but because it no longer completes any tasks, and therefore generates a zero fit parameter. Crisis has the worst asymptotic behavior of the three strategies.


analysis,  although  there  is  a  range  of  low  task  loading  in  which  it  performs  slightly  better  than  the 
opportunistic  model,  as  shown  in  figure  3.3.  Also,  it  would  appear  from  figures  3.6  and  3.7  that 
opportunistic  allocation  is  asymptotically  better  than  democratic,  since  it  has  the  exponent  smallest 
in  magnitude.  But  in  figure  3.3  we  see  that  it  is  uniformly  worse  than  democratic  in  this  case.  This 
is  not  necessarily  a  contradiction  as  the  decay  exponent  does  not  take  into  account  the  vertical- 
axis  intercepts  of  the  curves.  As  we  will  see  there  are  regions  of  the  parameter  space  (usually  at 
high  task  loads)  in  which  opportunistic  does  out-perform  democratic  by  a  small  margin,  as  well  as 
regions  (very  low  task  loads)  in  which  crisis  does  better  than  opportunistic. 

Generally,  for  any  choice  of  simulation  parameters,  as  load  increases  there  is  first  a  regime  in 
which  democratic  is  superior,  followed  by  opportunistic.  Crisis  starts  out  better  than  opportunistic, 
then  switches  places  at  a  fairly  low  task  density.  Characterizing  these  switches  is  the  object  of  the 
next  section. 

3.3.5  Crossover  Points 

Figures  3.3  and  3.5  illustrate  what  we  will  call  a  ‘crossover  point,’  where  two  average  success 
curves  cross.  We  see  that  the  crisis  strategy  is  superior  to  the  opportunistic  strategy  until  it  reaches 
Power Law Exponent as a Function of Maximum Deadline

Figure 3.7: Plot of power law exponents against maximum deadline. This is essentially the inverse of figure 3.6, because tasks become easier with increasing deadline, as opposed to harder with increasing communication time. Again, we see that the crisis strategy has the worst asymptotic behavior over the domain.


a  crossover  point  after  which  the  opportunistic  strategy  is  better.  From  the  same  data  used  to  obtain 
the  power  law  exponent  in  figures  3.6  and  3.7  we  measured  this  crossover  point  to  the  nearest  five- 
task  unit  on  both  the  actual  curve  and  our  fitted  curve.  These  points  are  plotted  in  figures  3.8  and 
3.9. 

As $\beta$ increases or the maximum deadline $D$ decreases, opportunistic eventually outperforms democratic. This is shown in figures 3.10 and 3.11. In these figures the predicted values match the actual values much more closely than in figures 3.8 and 3.9. This is due to the model we used - we wanted it to match the actual results in the high task-loading regime, in order to find the decay exponents. Consequently it does not match as well in the low task-loading regime where the crisis-opportunistic crossover occurs. For any two predicted curves with fit parameters $k_1$ and $k_2$, and exponents $b_1$ and $b_2$, the crossover location in terms of task loading is given by

$$
x = \left(\frac{k_2}{k_1}\right)^{1/(b_1 - b_2)}.
\tag{3.35}
$$
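
Two fitted curves of the form (3.33) intersect where $k_1x^{b_1} = k_2x^{b_2}$, so the predicted crossover follows directly from the fit parameters. A minimal sketch, with a hypothetical function name and made-up parameter values:

```python
def crossover_load(k1, b1, k2, b2):
    """Task loading at which two fitted success curves of the form (3.33) intersect."""
    if b1 == b2:
        raise ValueError("curves with equal exponents do not cross at a single point")
    # k1 * x**b1 == k2 * x**b2  =>  x**(b1 - b2) == k2 / k1
    return (k2 / k1) ** (1.0 / (b1 - b2))


x_cross = crossover_load(5.0, -1.5, 2.0, -1.0)
```

The crossover depends on the prefactors $k$ as well as the exponents $b$, so a strategy with the better decay exponent can still lose over a wide range of loads.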

We  conclude  that  there  are  regions  in  the  parameter  space  in  which  the  democratic  method  is  the 
most  successful,  and  other  regions  where  opportunistic  is  better.  The  regions  where  opportunistic 
is  superior  have  difficult  tasks  due  to  significant  communication  overhead  or  short  deadlines,  and 
heavy  task  loads.  The  crisis  strategy  is  nowhere  superior  to  the  democratic,  and  this  fact  in  addition 
to  its  larger  decay  exponent  make  it  definitively  the  worst  of  the  three  strategies.  This  should  not 
Crisis to Opportunistic Crossover Point as a Function of Communication Time

Figure 3.8: Crossover points where the opportunistic strategy becomes superior to the crisis strategy as a function of communication overhead $\beta$, where the maximum deadline is $D = 6$. Both the actual crossover points and the points predicted by our power law fit are plotted. As mentioned in the text, the fitted model is designed to accurately predict the asymptotic behavior of the strategies with increasing task density. These crossover points occur at small task loads, where the model does not fit as accurately. Consequently, the predicted values are not very accurate.

Crisis to Opportunistic Crossover Point as a Function of Maximum Deadline

Figure 3.9: Crossover points where the opportunistic strategy becomes superior to the crisis strategy as a function of maximum deadline $D$, where the communication overhead is $\beta = 0$. Both the actual crossover points and the points predicted by our power law fit are plotted. As mentioned in the text, the fitted model is designed to accurately predict the asymptotic behavior of the strategies with increasing task density. These crossover points occur at small task loads, where the model does not fit as accurately. Consequently, the predicted values are not very accurate.

Figure 3.10: Crossover points where the opportunistic strategy becomes superior to the democratic strategy as a function of communication time $\beta$, where the maximum deadline is $D = 6$. Both the actual crossover points and the points predicted by our power law fit are plotted. As mentioned in the text, the fitted model is designed to accurately predict the asymptotic behavior of the strategies with increasing task density. These crossover points occur at large task loads, where the model fits well. Consequently, the predicted values are accurate. The sharp increase is due to the fact that no crossover occurs before $\beta = 0.5$.


be  too  surprising  as  common  sense  dictates  against  leaving  things  until  the  last  minute. 

Furthermore, increasing $\beta$ seems to generate results that are the inverse of increasing $D$. If we look at the dimensionless deadline $\delta_j = (1 - \beta)\frac{M}{R_1}D_j$, we can see the reason for this. Increasing $\beta$ makes $\delta$ smaller, thus increasing the probability of task failure, whereas increasing the maximum deadline $D$ will increase $\delta$ on average. Using the same reasoning it seems obvious that increasing the number of resources $M$ will make things easier, while increasing the maximum task size $R$ will make things harder. This also squares with our intuition about the problem.

3.4  Conclusion 

In  this  chapter  we  have  demonstrated  a  framework  for  analyzing  a  large-scale  system  with  multiple 
independent  agents.  Our  analysis  is  independent  of  the  type  of  task  being  performed,  as  long  as 
that  task  is  homogeneous  with  respect  to  resource  use  (each  member  of  our  pool  of  M  men  is 
equally  skilled)  and  difficulty  (each  section  of  ditch  is  equally  difficult).  The  actual  procedure  for 
negotiation  can  also  be  ignored  if  it  can  be  cast  in  the  form  of  a  resource  allocation  strategy  with  an 
associated characterization of the communication and negotiation costs ($\beta$). While our analytical methods are eventually bogged down by increasing the number of tasks, they do provide valuable insight into the structure and complexity of the problem.

Democratic to Opportunistic Crossover Point as a Function of Maximum Deadline

Figure 3.11: Crossover points where the opportunistic strategy becomes superior to the democratic strategy as a function of maximum deadline $D$, where the communication overhead is $\beta = 0$. Both the actual crossover points and the points predicted by our power law fit are plotted. As mentioned in the text, the fitted model is designed to accurately predict the asymptotic behavior of the strategies with increasing task density. These crossover points occur at large task loads, where the model fits well. Consequently, the predicted values are accurate. The sharp decrease is due to the fact that no crossover occurs after $D = 3$.

We might also consider applying our results to task completion in real life. In general, democratic allocations perform well - in the real world one could apply this by giving equal time to all tasks, all other things being equal. From the authors' experience, this is perhaps an unrealistic, 'ideal' strategy. Usually pressure from deadlines forces one to adopt either a crisis or an opportunistic allocation. Our results show that crisis can be a successful strategy at low task densities. But as the number of tasks increases, someone operating under a crisis management strategy becomes increasingly stressed and inefficient. Opportunistic, on the other hand, is nearly as effective as democratic, and under very large task densities becomes more efficient. If someone has more work than can possibly be completed on time, it makes sense to finish the smallest tasks. While we would be unjustified in drawing broad conclusions about the real world from our results, the results do seem to square with our intuition about how people get things done under time constraints.

The results of our simulations show that on average, either the democratic or the opportunistic strategy outperforms the other two strategies, depending on the parameters chosen. Generally an
opportunistic  allocation  becomes  beneficial  at  higher  task  loads,  when  the  only  tasks  that  have 
a  good  chance  to  finish  are  the  smaller  ones.  Another  research  goal  might  be  to  discover  the 
optimal  strategy  for  a  given  set  of  parameters.  The  existence  and  movement  of  crossover  points  also 
suggests  that  there  are  some  interesting  dynamics  in  this  problem  that  could  be  further  explored. 

Another  option  is  to  explore  an  extension  of  this  problem  to  the  case  where  each  ditch  is  located 
at  a  specific  point  in  space,  and  there  are  costs  associated  with  moving  resources  from  one  ditch 
to  another.  Dependencies  among  tasks  are  an  additional  complication  -  in  the  real  world  often 
tasks  must  be  completed  in  a  particular  order.  Logistics  problems  might  be  a  good  test  bed  for  this 
analysis.  Our  methods  could  also  be  applied  to  the  challenge  problem  discussed  in  the  introduction, 
where  the  resources  have  a  fixed  location  and  the  tasks,  or  targets,  move  through  the  system. 


3.5  A  Track  Quality  Model  for  Distributed  Sensing  Networks 

3.5.1  Introduction 

A  network  of  distributed  sensing  and/or  computing  resources  provides  an  obvious  advantage  for 
military  applications  -  it  can  continue  to  function  if  some  or  possibly  even  most  of  the  resources 
are  destroyed.  In  this  section  we  present  a  model  for  estimating  the  ability  of  the  sensors  to  track  a 
target  as  a  function  of  the  amount  of  time  the  sensors  spend  negotiating.  Necessarily  we  make  some 
assumptions  about  the  behavior  of  the  sensors  that  do  not  correspond  to  any  real-world  hardware. 
We  introduce  a  measure  of  quality,  Q,  that  is  relative  to  a  perfectly  accurate  track.  This  quality 
measure  has  no  physical  meaning  -  rather,  it  allows  us  to  make  qualitative  judgments  about  the 
amount  of  time  the  sensors  should  spend  negotiating. 
3.5.2 Parameters and Variables

| Quantity | Units | Description |
|---|---|---|
| $j$ | - | Target index |
| $R$ | length | Detection radius of a sensor (assumed to be the same for all sensors) |
| $x_j$ | - | Position of target $j$ in two-dimensional Euclidean space |
| $A_j$ | length^2 | Area in which any sensor can detect target $j$ |
| $\mathcal{I}_j$ | - | $\{k \mid A_j \text{ intersects } A_k\}$. Note: the $\mathcal{I}_j$ are not unique, e.g. if $A_1$ intersects $A_2$, $\mathcal{I}_1 = \mathcal{I}_2 = \{1, 2\}$ |
| $A_j^\cap$ | length^2 | Area($\mathcal{I}_j$), the area in which every target in $\mathcal{I}_j$ can be detected |
| $S(a)$ | sensors/length^2 | Sensor density over some area $a$ |
| $M_j$ | sensors | $S(A_j^\cap) \cdot A_j^\cap$, number of sensors that can track $j$ and another target |
| $\omega$ | - | Amount of time spent in negotiation and associated overhead per $\Delta t$ |
| $f_j(\omega)$ | - | Fraction of $M_j$ allocated to $j$ as a function of negotiation time |
| $m_j$ | sensors | $S(A_j)(A_j + (f_j - 1)A_j^\cap)$, number of sensors allocated to $j$ after negotiation |
| $Q_j(t)$ | quality | Quality of track $j$ as a function of time |
| $q_j(m_j)$ | quality/time | Quality added to track $j$ per $\Delta t$ as a function of the number of sensors allocated |
| $\lambda$ | 1/time | Quality degradation factor |

3.5.3  Modeling  and  Assumptions 

The following rate equation is used to model the change in track quality over time:

$$
\frac{dQ_j}{dt} = q_j(m_j)(1 - \omega) - \lambda Q_j.
\tag{3.36}
$$

So the track quality is altered at any time by the quality added and the quality lost, being the first and second terms of (3.36) respectively. The quality added is controlled directly by the time spent negotiating $\omega$, and indirectly by the results of negotiation $m_j$. Quality lost is the current quality multiplied by the degradation factor $\lambda$.

We will assume the sensor distribution is fairly dense and uniform, so the density is a constant $S \equiv S(a)$ for every area $a$. This assumption may or may not be reasonable for actual hardware. The area of intersection $A_j^\cap$ is a function of the $x_j$'s, which are assumed to be known with perfect accuracy at any time. In the real world there would likely be some error, but this would be an engineering problem specific to the hardware in question.

So the only variable within control of the system is $\omega$. Our idea is that $f_j$ increases as $\omega$ increases, since more time spent negotiating should result in more sensors allocated to the target. However, no specific behavior of this function is required by the model, so any function could be used. As an empirical example, consider the results of a graph coloring algorithm outlined in a technical report by the Kestrel Institute (May 2001). This algorithm implements a democratic allocation - each target receives an equal share of the system resources. However, it takes time to distribute the sensors correctly. Their results follow a curve that looks approximately like

$$
f_j(\omega) = \frac{\omega}{n(c + \omega)},
\tag{3.37}
$$

where $n$ is the number of intersecting targets and $c$ is a constant.

Figure 3.12: $f_j(\omega)$, $n = 2$, $c = 0.08$


Another hardware-specific question is the incremental improvement in quality per sensor allocated to the target. Again, any model can be used. One possibility is that a certain critical quantity of sensors, say $\mu$, is required to get a decent track, and any more than that results in progressively smaller increases. This could be modeled by


+  my 


(3.38) 


In the following example we will use a linear model, $q_j(m_j) = m_j$, to exaggerate the benefits of negotiation.

Also there are a variety of functions one could use for the degradation factor $\lambda$. The simplest function would be a constant $\lambda = 1/t_c$, where a measurement must be taken every $t_c$ seconds or quality will be lost. Other reasonable functions could depend on target variables such as velocity and size.
3.5.4 A Two Target Example

The distance between the two targets is given by $D = \|x_1 - x_2\|$. Since there are only two targets we can drop the subscripts and refer to the area of intersection as $A^\cap$. It is easily shown that

$$
A^\cap(D) =
\begin{cases}
2R^2\cos^{-1}\left(\dfrac{D}{2R}\right) - \dfrac{D}{2}\sqrt{4R^2 - D^2}, & D < 2R,\\
0, & D \ge 2R.
\end{cases}
\tag{3.39}
$$

The targets will follow trajectories $x_1 = (t, 1)$ and $x_2 = (t, 1 + 2R(1 - t/T))$ for $0 \le t \le T$, using a sensing radius $R = 1$.

Figure 3.13: $x_1$, $x_2$, $T = 1000$

We define the normalized area of intersection to be $\hat{A}^\cap = A^\cap / (\pi R^2)$. So over time $T$, $0 \le \hat{A}^\cap \le 1$. If we allow $T$ to be very large, $\hat{A}^\cap$ will increase slowly from 0 to 1.

Note that in the two target case, equation (3.36) is the same for each target, so we will be dropping the subscripts entirely. Then $m(t, \omega) = S(\pi R^2 - (1 - f(\omega))A^\cap(t))$, so $q(m)$ is a function of $t$ and $\omega$. After the above assumptions and definitions are incorporated we have

$$
\dot{Q} = q(t, \omega)(1 - \omega) - \lambda Q(t).
\tag{3.40}
$$

Solving for $Q$ yields

$$
Q(t, \omega) = \int_0^t e^{\lambda(s - t)}q(s, \omega)(1 - \omega)\,ds.
\tag{3.41}
$$

For sufficiently large $T$, (3.41) should reach equilibrium values over the range of values for $\hat{A}^\cap$. In order to prove this we would have to introduce a slowly varying parameter $\epsilon t$ and find the solution to (3.41) as $\epsilon \to 0$. An optimal value for $\omega$ can then be determined in the limit as $t \to \infty$.
Chapter 4

Praxeic  Decision  Theory:  Single  and 
Multiple  Agents,  and  Examples 

4.1  Overview  to  Praxeic  Decision  Theory 

Praxeic utility theory is an approach to decision making and control that provides locality of decisions and avoids over proscription by providing set-valued solutions. As opposed to conventional optimization approaches to decision making and control (approaches leading to Bayes decisions, optimal control, optimal filtering, and a host of other successful techniques), the praxeic viewpoint is to weigh each alternative in the decision space on the basis of its own merits, retaining as candidate choices all those whose utility toward approaching a decision goal exceeds the weighted cost of the choice. By considering choices on the basis of individual merit, an optimal choice is not deliberately sought, but candidate choices can be regarded as being good enough for the solution. Thus, praxeic utility theory provides a generally constructive way of approaching problems while breaking out of the “grip of optimality.”

The basic framework for praxeic utility theory builds upon two functions satisfying the axioms of probability, which respectively measure the utility of a decision with respect to moving toward some desired goal, and the cost associated with that decision. These functions are called the selectability and the rejectability, and are usually denoted $p_S(u)$ and $p_R(u)$, where the argument $u$ represents a choice under consideration. Out of a set of possible decisions $U$, praxeic decision theory retains all those choices $u$ for which $p_S(u) \ge b\,p_R(u)$; these choices are said to be satisficing. In this expression, the parameter $b$ represents a decision maker’s “boldness” in rejecting options in the interest of being more selective. Lowering the boldness results in retaining more decisions (being less decisive).
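The satisficing test is simple enough to state in a few lines of code. The following is a minimal sketch, assuming a finite option set; the option labels and the numerical values of the selectability and rejectability are hypothetical.

```python
def satisficing_set(p_s, p_r, b=1.0):
    """Return every option u with p_S(u) >= b * p_R(u)."""
    return [u for u in p_s if p_s[u] >= b * p_r[u]]

# Hypothetical selectability and rejectability values (each sums to 1).
p_s = {"a": 0.5, "b": 0.3, "c": 0.2}
p_r = {"a": 0.2, "b": 0.3, "c": 0.5}

bold = satisficing_set(p_s, p_r, b=1.0)      # options kept at boldness 1
cautious = satisficing_set(p_s, p_r, b=0.3)  # lower boldness keeps more options
```

In this example option `c` is rejected at boldness 1 but retained at boldness 0.3, which illustrates how lowering the boldness enlarges the satisficing set.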

While  a  straightforward  concept,  praxeic  decision  theory  has  proven  successful  at  a  variety 
of  problems,  some  of  which  have  eluded  prior  solution.  As  an  example,  the  inverted  pendulum 
problem  —  the  problem  of  balancing  a  broomstick  on  your  palm  —  has  been  solved,  even  for  the 
case  when  the  pendulum  (broom)  is  hanging  down.  In  this  case,  the  control  problem  is  a  nonlinear 
problem  not  amenable  to  any  standard  method. 

Notwithstanding its potential for single-agent decision making, one of the real attractions of praxeic utility theory is that it can be rationally extended to group decision making problems. In this realm, it provides a useful new perspective to contrast with prior techniques where each agent acts essentially in a substantively rational way, maximizing its own utility, effectively shutting out solutions which might be useful when viewed from the group perspective. The extension to group decision making arises by defining a joint selectability-rejectability function $p_{SR}(u_1, u_2, \ldots, u_N, v_1, v_2, \ldots, v_N)$, with argument slots for the selectability and rejectability of each of the $N$ agents coordinating in the system.

The  joint  selectability-rejectability  function  incorporates  knowledge  about  how  others’  choices 
may  affect  the  selectability  and  rejectability  of  an  individual.  Since  by  design  the  selectability  and 
rejectability  functions  have  the  properties  of  probabilities,  factorization  of  the  joint  function  into 
units  which  identify  localized  behaviors  is  possible.  Thus,  it  is  possible  to  encode,  by  means  of 
conditional  probabilities,  if-then  type  statements:  If  an  agent  Y  does  “this,”  then  the  response  of 
agent  X  should  be  “that.” 

By representing explicitly the influence that other agents may have on their own decisions, the joint selectability-rejectability function provides immediately the means for coordinated decision making. When the group wants to act collectively, a joint selectability and a joint rejectability are computed (using the properties of probabilities) by

\[ p_S(\mathbf{u}) = \sum_{\mathbf{v}} p_{SR}(\mathbf{u}, \mathbf{v}) \]

and

\[ p_R(\mathbf{v}) = \sum_{\mathbf{u}} p_{SR}(\mathbf{u}, \mathbf{v}), \]

where the sum is over the cross section of choices. Using these joint functions, the set of group decisions is selected for which

\[ p_S(\mathbf{u}) \ge b\,p_R(\mathbf{u}). \]

Another approach is for agents to make individual decisions, which still recognize and account for the preferences of the other agents. To do this, an agent can compute its own marginal selectability and rejectability functions by

\[ p_{S_i}(u_i) = \sum_{\mathbf{u}} p_S(u_i, \mathbf{u}), \qquad p_{R_i}(u_i) = \sum_{\mathbf{u}} p_R(u_i, \mathbf{u}), \]

where the sums are over all other agents’ decision spaces. Then the single-agent decision, accounting for others, is to retain all choices $u_i$ for which

\[ p_{S_i}(u_i) \ge b\,p_{R_i}(u_i). \]
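The group and individual decision rules can be compared on a small example. The sketch below assumes two agents with two options each; the joint selectability and rejectability values are hypothetical, and the joint selectability-rejectability function is built as a product purely for illustration (an independence assumption that the theory does not require).

```python
from itertools import product

OPTIONS = (0, 1)
JOINT = list(product(OPTIONS, repeat=2))  # joint options (u1, u2)

def satisficing(p_s, p_r, b):
    """Options whose selectability is at least b times their rejectability."""
    return {u for u in p_s if p_s[u] >= b * p_r[u]}

def marginal(joint, agent):
    """Sum a joint function over the other agent's options."""
    out = {o: 0.0 for o in OPTIONS}
    for u, p in joint.items():
        out[u[agent]] += p
    return out

# Hypothetical joint selectability and rejectability over (u1, u2).
p_s = {(0, 0): 0.4, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.1}
p_r = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

# Assumed product form of the joint selectability-rejectability function.
p_sr = {(u, v): p_s[u] * p_r[v] for u in JOINT for v in JOINT}

# Joint selectability and rejectability recovered by summation.
joint_s = {u: sum(p_sr[(u, v)] for v in JOINT) for u in JOINT}
joint_r = {v: sum(p_sr[(u, v)] for u in JOINT) for v in JOINT}

group_set = satisficing(joint_s, joint_r, b=1.0)
agent_sets = [satisficing(marginal(joint_s, i), marginal(joint_r, i), b=1.0)
              for i in range(2)]
```

Here the group retains two joint options while each agent, looking only at its marginals, retains a single option; the two views agree on the joint option `(0, 0)`.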

This distinction between a group decision (based on joint selectability and rejectability) and an individual decision (based on marginal selectability and rejectability) leads to a definition for negotiation. It may be recognized that under the constraint of approaching a generally accepted collective decision, negotiators are usually more concerned with meeting minimum performance requirements than with meeting maximum performance, that negotiations should lead to decisions that are both good enough for the group as a whole and also good enough for each individual, and that an element of search is typically necessary as negotiators work toward solutions (Stirling, p. 163). In light of these observations, discrepancies initially evident between the individuals (based on their marginal selectability and rejectability) and the group (based on joint selectability and rejectability), which represent a mismatch between individual and collective choice, can be worked out by a general lowering of expectations. This can be accomplished algorithmically by lowering the boldness until individual and group preferences share a nonempty set of solutions.
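The lowering of boldness can be sketched as a simple search loop. The code below is a hedged illustration, not the report's algorithm: it assumes two agents, a single boldness value shared by the group and both individuals, a fixed step size, and hypothetical joint selectability and rejectability tables.

```python
OPTIONS = (0, 1)

def satisficing(p_s, p_r, b):
    return {u for u in p_s if p_s[u] >= b * p_r[u]}

def marginal(joint, agent):
    """Sum a joint function over the other agent's options."""
    out = {o: 0.0 for o in OPTIONS}
    for u, p in joint.items():
        out[u[agent]] += p
    return out

def negotiate(p_s, p_r, b_start=1.0, step=0.05):
    """Lower the boldness until some joint option is good enough for the
    group and, component by component, for each individual agent."""
    n_steps = int(round(b_start / step))
    for k in range(n_steps + 1):
        b = b_start - k * step
        group = satisficing(p_s, p_r, b)
        own = [satisficing(marginal(p_s, i), marginal(p_r, i), b)
               for i in range(2)]
        compromise = {u for u in group
                      if all(u[i] in own[i] for i in range(2))}
        if compromise:
            return b, compromise
    return None, set()

# Hypothetical tables: each agent alone favours option 0,
# but the group finds the joint option (0, 0) too rejectable.
p_s = {(0, 0): 0.40, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.10}
p_r = {(0, 0): 0.50, (0, 1): 0.10, (1, 0): 0.10, (1, 1): 0.30}

boldness, compromise = negotiate(p_s, p_r)
```

At boldness 1 the individual and group sets are disjoint; after three reductions the agents also accept option 1, and the joint options `(0, 1)` and `(1, 0)` become mutually acceptable.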

Multi-agent praxeic utility also offers other alternatives to group decision making. For example, it is possible to model a deliberative process: agents making choices based on the choices they think other agents might be making. Explicit knowledge models can be incorporated, with one agent modeling the expected behavior of other agents in formulating its decision functions. Several useful models of coordination and cooperation are thus possible.

As a research group, we have invested a lot of time exploring the basic principles of praxeic utility. In doing this, several example applications have been reviewed, among them the linear quadratic regulator problem and the inverted pendulum problem (single user problems); an extension of the laissez-faire problem (a resource allocation problem); the problem of assigning pilots to planes with constraints regarding skill level and pilot satisfaction (negotiated multi-agent decisions); “capture the flag,” a dynamic game of pursuit (a coordinated decision problem); and the prisoners’ dilemma, a famous example of a difficult two-person game from game theory (an example of a deliberative solution). The intent in working through these examples was to provide a background for problems of immediate interest, including the “screaming generals” problem that has been presented as a model for task assignment.

Continued investigation of the general praxeic utility theory may lead to interesting research questions. For example, some problems appear to admit only a coordinated solution (each agent taking into account the presence and actions of other agents, but without the explicit requirement that all solutions be collectively acceptable). However, as the constraints on the problem increase, it may be necessary to enforce group preferences. The behavior in the transition region between coordinated and negotiated decisions seems like an interesting question, since it appears to be on the boundary of where solutions become hardest to find. Another related question is to determine regimes where individual coordinated solutions are superior (in some measure) to negotiated solutions.

There was another aspect of discussion in the research group, motivated in part by issues related to praxeic utility. Formulating the joint selectability-rejectability function can be a computationally difficult task, especially if the decision spaces and the number of users are large. One of the alternatives we explored is the concept of economic equilibrium, where the “negotiated” solution depends finally on a price vector which establishes a collective equilibrium. One of the intended directions for ongoing research is the possibility of using such economic models to formulate the selectability and rejectability functions, combining the richness of economic modeling with the flexibility of praxeic utility theory.
4.2 An in-depth look at praxeic decision theory

The  following  discussion  is  derived  from  [17]  and  [18]. 

Traditional engineering design methodologies are caught in “the grip of optimality”: Virtually every engineering design concept can be expressed as X = 0. For example, optimal estimation, optimal filtering, optimal detection, optimal equalization, and optimal control all are based on principles of optimality. Why is this? Certainly there is historical justification for optimality-based methods, and often analytically tractable solutions are obtained. Looking from a more rational point of view, however, being “good enough” should suffice in many circumstances, while being “best” is a bonus. However, there has been no systematic development of the “good enough,” while calculus provides us a measure of optimality.

4.2.1  Modes  of  rationality 

In  contemplating  these  design  methodologies,  we  are  moved  to  identify  three  different  modes  of 
rationality, 

•  Maximize  expected  utility:  substantively  rational  (the  best) 

•  Follow  rules  or  procedures:  procedurally  rational  (what  we  can  come  up  with;  heuristic; 
ad  hoc) 

•  Expected  gains  exceed  expected  losses:  intrinsically  rational. 

Regarding  this  trichotomy,  the  following  observation  has  been  made: 

...  the  real  accomplishment  will  come  in  finding  an  interesting  middle  ground  between 
hyperrational  behavior  and  too  much  dependence  on  ad  hoc  notions  of  similarity  and 
strategic  expectations.  When  and  if  such  a  middle  ground  is  found,  then  we  may  have 
useful theories with situations in which the rules are somewhat ambiguous. (Kreps, 1990, p. 184)

A  mode  of  rationality  is  desired  that  addresses 

•  Adequacy  -  When  is  a  solution  “good  enough”? 

•  Sociality  -  How  can  a  group  rationality  be  defined?  (“Liberation  from  maximization  may 
open  the  door  to  accommodating  group  as  well  as  individual  interests.”) 

•  Intrinsic - the comparisons for selecting an object depend on the object itself (not on other objects).

•  Self-Criticism  -  How  can  the  quality  of  the  solution  be  gauged? 

Regarded  from  the  point  of  view  of  information  gathering,  as  a  precursor  to  a  more  active 
control  stance,  we  have  the  following  observation: 
Minimizing the probability of error is not equivalent to avoiding error. Indeed, if an
expressed  aim  of  our  inquiry  is  to  avoid  error,  we  may  comply  completely  by  simply 
refusing  to  make  any  choice  at  all...  Evidently,  the  decision  maker  must  be  willing  to 
incur  some  risk  of  error  if  a  meaningful  decision  is  to  be  made.  (Stirling,  2000) 

As an example in a decision-theoretic context, consider a digital communication system in which the system output is the set-based answer: the bit is either 1 or 0. This (useless) system is guaranteed to have a probability of error which is always zero: the receiver is never incorrect. However, these decisions are useless: the ultra-cautionary stance has led to a system that is always correct, and simultaneously never informative.

We  thus  conceive  of  two  desiderata  for  a  decision-making  agent:  the  desire  to  obtain  new 
information  (knowing  the  truth)  and  avoiding  error. 

These  observations  are  echoed  by  the  statement  of  the  logician  Whitehead: 

It is more important that a proposition be interesting than that it be true. This statement is almost a tautology. For the energy of operation of a proposition in an occasion of experience is its interest, and its importance. But of course, a true proposition is more apt to be interesting than a false one. (Whitehead, 1937).

4.2.2  Truth  and  error  valuations 

One of the tenets of praxeology is that set decisions are permissible. Given a set of propositions (choices) $U$ available to a decision-making agent $X$, we contemplate the Boolean algebra $\mathcal{F}$ of possible choices.

In the interest of obtaining new information (without necessary regard for its veracity or usefulness), $X$ imposes a valuation on the choices available to it. If a proposition is of low interest (it is uninformative) then there is a high value in rejection. We quantify this by introducing $P_R$, the informational value of rejection, which satisfies:

•  $P_R(\emptyset) = 0$ (no value in rejecting nothing) and $P_R(U) = 1$ (normalized).

•  Additive structure: $P_R(A_1 \cup A_2) = P_R(A_1) + P_R(A_2)$ if $A_1 \cap A_2 = \emptyset$.

For example, by rejecting $\{u_i\} \in \mathcal{F}$, we “conserve” the informational value $P_R(\{u_i\})$.

Under the assumption that only one choice in $U$ is “correct” (true), the utility of accepting a set $A \in \mathcal{F}$ is

\[
I_A(u) =
\begin{cases}
1, & u \in A, \\
0, & u \notin A.
\end{cases}
\]

We can regard $I_A(u)$ as the error avoidance utility: the utility of not rejecting the set $A$ if $u$ is true.
If an agent $X$ wants only information value, he should go with $P_R$; if he wants only truth, he should go with $I_A$. In the epistemological framework of Levi [19], $X$ employs a convex sum of utilities:

\[ \phi(A, u) = \alpha I_A(u) + (1 - \alpha)(1 - P_R(A)), \]

for $0 < \alpha < 1$. $\phi(A, u)$ is the epistemic utility function, measuring the epistemic value of not rejecting $A$ when $u$ is true. If $u \notin A$, $\phi(A, u) = (1 - \alpha)(1 - P_R(A))$. If $u \in A$, $\phi(A, u) = \alpha + (1 - \alpha)(1 - P_R(A))$.

By a change of variables (which does not change the utility ordering), we can write

\[ \varphi(A, u) = I_A(u) - b P_R(A) \]

(where $b = (1 - \alpha)/\alpha$), which is an equivalent utility function. We shall call $b$ the index of boldness.
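The change of variables can be written out explicitly. The following short derivation is a sketch of one way to see it, using only the convex-sum utility and the definition of $b$ given above; the intermediate steps are supplied here and are not quoted from [19].

```latex
\begin{align*}
\phi(A,u) &= \alpha I_A(u) + (1-\alpha)\bigl(1 - P_R(A)\bigr) \\
          &= \alpha I_A(u) - (1-\alpha) P_R(A) + (1-\alpha), \\
\frac{\phi(A,u) - (1-\alpha)}{\alpha}
          &= I_A(u) - \frac{1-\alpha}{\alpha}\, P_R(A)
           = I_A(u) - b\, P_R(A).
\end{align*}
```

Subtracting the constant $(1 - \alpha)$ and dividing by $\alpha > 0$ is a positive affine transformation, so any two sets $A$ are ranked the same way by both forms of the utility.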


4.2.3  Expected  utility 


$X$ does not know which element is true, and so cannot evaluate $\varphi(A, u)$ exactly. However, we assume $X$ has a probability distribution $P_S$, the credal probability, to measure the degree to which it believes the propositions.

On the basis of this probability distribution, we form an average:

\[ \bar{\varphi}(A) = \int_U \varphi(A, u)\,P_S(du) = P_S(A) - b P_R(A). \]

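This produces the expected epistemic utility. For discrete outcomes:

\[ p_S(u) = P_S(\{u\}), \qquad p_R(u) = P_R(\{u\}), \]

\[ \bar{\varphi}(A) = \sum_{u \in A} \left[ p_S(u) - b\,p_R(u) \right]. \]

For continuous outcomes:

\[ P_S(A) = \int_A p_S(u)\,du, \qquad P_R(A) = \int_A p_R(u)\,du, \]

\[ \bar{\varphi}(A) = \int_A \left[ p_S(u) - b\,p_R(u) \right] du. \]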

Based on this utility, we formulate a set-valued decision which maximizes the expected utility:

\[ S_b = \arg\max_{A \in \mathcal{F}} \bar{\varphi}(A) = \{u \in U : p_S(u) \ge b\,p_R(u)\}. \]

This is the set of decisions for which the truth support is greater than or equal to the informational value of rejection, weighted by the boldness. (Any element in this set is an acceptable answer on the basis of the criteria given.)

Under this decision criterion, there is no compulsion to accept only the “best,” nor even a designation as to what best is. There are only those choices which, based on intrinsic valuations, are worth the risk of accepting. Elements not chosen (that is, those in $U \setminus S_b$) are either not likely to be true, or are not worth the risk of choosing even if true.
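The claim that the satisficing set maximizes the expected epistemic utility can be checked by brute force on a small discrete example. The sketch below enumerates every subset of a hypothetical three-option decision space; the option labels, probabilities, and boldness are illustrative assumptions chosen so that no option sits exactly on the threshold.

```python
from itertools import combinations

def expected_utility(subset, p_s, p_r, b):
    """Discrete expected epistemic utility: sum of p_S(u) - b*p_R(u) over the set."""
    return sum(p_s[u] - b * p_r[u] for u in subset)

def best_subset(p_s, p_r, b):
    """Search every subset of the option space for the utility maximizer."""
    options = sorted(p_s)
    subsets = (frozenset(c)
               for k in range(len(options) + 1)
               for c in combinations(options, k))
    return max(subsets, key=lambda s: expected_utility(s, p_s, p_r, b))

# Hypothetical credal (selectability) and rejection-value masses.
p_s = {"a": 0.5, "b": 0.3, "c": 0.2}
p_r = {"a": 0.2, "b": 0.3, "c": 0.5}
b = 0.8

s_b = frozenset(u for u in p_s if p_s[u] >= b * p_r[u])
brute_force = best_subset(p_s, p_r, b)
```

Because the utility is a sum of per-option terms, the maximizing set simply collects every option whose term is nonnegative, which is why the exhaustive search and the threshold test agree.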
4.2.4 Decisions → Actions

Discussion to this point has taken the viewpoint of a decision theory, and may be regarded as a discussion in the branch of philosophy known as epistemology:

Epistemology: the study or theory of the origin, nature, methods, and limits of knowledge. (What to believe.)

We now shift our viewpoint from one of decision theory to one of control. In making this transition, we make use of the term praxeology, introduced by Stirling, as a “control” parallel to epistemology:

Praxeology: the study or theory of practical activity; the science of efficient action. (How to act.)

In making this transition, we map the notions of “truth” and “error” to concepts applicable to a domain of action: We want more than “success,” we want “efficient” success. An agent contemplating action employs the rejectability and selectability of his options, defined as follows:

•  Rejectability: Options are evaluated with respect to the degree of resource consumption.

•  Selectability: Options are evaluated with respect to the degree to which they accomplish the objective.

As in the decision-theoretic case, the agent assigns a selectability measure $p_S(u)$ and a rejectability measure $p_R(u)$. Proceeding along parallel lines, we arrive at the maximum praxeic utility decision rule,

\[ S_b = \{u \in U : p_S(u) \ge b\,p_R(u)\}. \]

Under this rule, options are selected for which the selectability is not less than $b$ times the rejectability. We designate this test as the PLRT (praxeic utility likelihood ratio test).

4.2.5  Tie  breaking 

By the stated criteria, any element in $S_b$ is acceptable. However, when it is necessary to reduce the choices to a single one, one of several tie breakers may be used:

•  A satisficing option $u_S$ is most selectable if $u_S = \arg\max_{u \in S_b} \{p_S(u)\}$.

•  A satisficing option $u_R$ is least rejectable if $u_R = \arg\min_{u \in S_b} \{p_R(u)\}$.

•  A satisficing option $u^*$ is maximally discriminating if $u^* = \arg\max_{u \in S_b} \{p_S(u) - b\,p_R(u)\}$.

A satisficing option $u_1$ is more satisficing than an alternative $u_2$ if $u_1$ is either (a) not less selectable and less rejectable than $u_2$ [i.e., $p_S(u_1) \ge p_S(u_2)$ and $p_R(u_1) < p_R(u_2)$] or (b) not more rejectable than $u_2$ and more selectable than $u_2$ [i.e., $p_R(u_1) \le p_R(u_2)$ and $p_S(u_1) > p_S(u_2)$].

A satisficing option is arbitrary if it is chosen at random from $S_b$.
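The tie breakers above can be sketched directly from their definitions. This is a minimal illustration over a finite option set; the option labels and numerical values are hypothetical and were chosen so that the three rules pick three different options.

```python
def satisficing_set(p_s, p_r, b):
    return [u for u in p_s if p_s[u] >= b * p_r[u]]

def most_selectable(s_b, p_s):
    return max(s_b, key=lambda u: p_s[u])

def least_rejectable(s_b, p_r):
    return min(s_b, key=lambda u: p_r[u])

def maximally_discriminating(s_b, p_s, p_r, b):
    return max(s_b, key=lambda u: p_s[u] - b * p_r[u])

def more_satisficing(u1, u2, p_s, p_r):
    """True if u1 dominates u2 in the sense of conditions (a) or (b)."""
    a = p_s[u1] >= p_s[u2] and p_r[u1] < p_r[u2]
    b_ = p_r[u1] <= p_r[u2] and p_s[u1] > p_s[u2]
    return a or b_

# Hypothetical selectability and rejectability values.
p_s = {"a": 0.40, "b": 0.35, "c": 0.15, "d": 0.10}
p_r = {"a": 0.35, "b": 0.15, "c": 0.10, "d": 0.40}
b = 1.0

s_b = satisficing_set(p_s, p_r, b)
picks = (most_selectable(s_b, p_s),
         least_rejectable(s_b, p_r),
         maximally_discriminating(s_b, p_s, p_r, b))
```

The example shows that the tie breakers encode different priorities: goal achievement, resource conservation, or the margin between the two.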
4.2.6 An example: Nonlinear quadratic regulator

To demonstrate the applicability of the praxeic concepts, we sketch here the application to the nonlinear quadratic regulator from control theory [18]. Suppose we have a nonlinear time-varying system described by the discrete-time dynamics

\[ \mathbf{x}(t+1) = f[\mathbf{x}(t), u(t), t], \quad t = 0, 1, \ldots, t_f - 1, \]

with a performance index

\[ J = \sum_{t=0}^{t_f - 1} \left[ \mathbf{x}^T(t+1)\,Q(t+1)\,\mathbf{x}(t+1) + R\,u^2(t) \right]. \]

We want to determine an input sequence $u(t)$, $t = 0, 1, \ldots, t_f - 1$, to minimize $J$: move the state to the origin and minimize costs along the way.

We will use proximity to the final goal, $\mathbf{x}^T(t_f) P \mathbf{x}(t_f)$, to determine selectability, and incremental costs $\mathbf{x}^T(t+1) Q \mathbf{x}(t+1) + R\,u^2(t)$ to determine rejectability: actions which move toward the goal will rate with high selectability, while actions which require expensive effort are more rejectable. In the interest of implementability, we will use a receding horizon controller, choosing at each time step the input $u(t)$ that is locally the best. Also, assume control over a bounded interval, $u \in (u_{\min}, u_{\max})$.

To define selectability, we proceed as follows. Pretend that the next step is the final one. Define

\[ \Phi(u) = \mathbf{x}^T(t+1)\,P\,\mathbf{x}(t+1). \]

Smaller distance is better, so we flip this around:

\[ g_S(u; \mathbf{x}(t)) = \sup_{v \in (u_{\min}, u_{\max})} \Phi(v) - \Phi(u) + \epsilon. \]

Now normalize so it looks like a probability:

\[ p_S(u) = \frac{g_S(u; \mathbf{x}(t))}{\int_{u_{\min}}^{u_{\max}} g_S(v; \mathbf{x}(t))\,dv}. \]

So: inputs which move us closer to target have higher selectability.

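Rejectability is based on incremental costs. Define

\[ \Lambda(u) = \mathbf{x}^T(t+1)\,Q(t+1)\,\mathbf{x}(t+1) + R\,u^2(t). \]

Smaller cost is less rejectable, so this is already oriented correctly. Shift:

\[ g_R(u; \mathbf{x}(t)) = \Lambda(u) - \inf_{v \in (u_{\min}, u_{\max})} \Lambda(v) + \epsilon, \]

and normalize:

\[ p_R(u; \mathbf{x}(t)) = \frac{g_R(u; \mathbf{x}(t))}{\int_{u_{\min}}^{u_{\max}} g_R(v; \mathbf{x}(t))\,dv}. \]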
The point of this demonstration at this juncture is this: to obtain the selectability and rejectability measures, simply identify the goals and costs, and quantify them, normalizing as probabilities.
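The recipe of identifying goals and costs, shifting, and normalizing can be sketched numerically on a grid of candidate inputs. Everything specific in the code below is an assumption made for illustration: the scalar dynamics in `step`, the weights `P`, `Q`, `R`, the offset `EPS`, the control bounds, and the Riemann-sum normalization.

```python
import math

P, Q, R, EPS = 1.0, 1.0, 0.1, 1e-6   # hypothetical weights and offset

def step(x, u):
    """Hypothetical scalar nonlinear dynamics x(t+1) = f[x(t), u(t)]."""
    return x + 0.1 * math.sin(x) + u

def praxeic_measures(x, u_min=-1.0, u_max=1.0, n=201):
    """Return the input grid with normalized selectability and rejectability."""
    du = (u_max - u_min) / (n - 1)
    grid = [u_min + i * du for i in range(n)]
    phi = [P * step(x, u) ** 2 for u in grid]                 # distance to goal
    lam = [Q * step(x, u) ** 2 + R * u ** 2 for u in grid]    # incremental cost
    g_s = [max(phi) - p + EPS for p in phi]                   # flip: closer is better
    g_r = [l - min(lam) + EPS for l in lam]                   # shift: cheapest is ~0
    norm_s = sum(g_s) * du
    norm_r = sum(g_r) * du
    return grid, [g / norm_s for g in g_s], [g / norm_r for g in g_r], du

grid, p_s, p_r, du = praxeic_measures(0.5)
u_most_selectable = grid[p_s.index(max(p_s))]
satisficing = [u for u, s, r in zip(grid, p_s, p_r) if s >= 1.0 * r]
```

For the state `x = 0.5` the most selectable input is the grid point that drives the next state closest to the origin, while the satisficing set is a whole interval of inputs rather than a single value.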

Now let us consider specialization to a linear quadratic regulator of a linear system, so that comparisons may be made with conventional optimal control theory. Our system has the model

\[ \mathbf{x}_{k+1} = A\mathbf{x}_k + B u_k, \quad k = 0, 1, 2, \ldots, t_f. \]

The control goal is to select a control input sequence $\{u_0, u_1, \ldots, u_{t_f - 1}\}$ to drive the state $\mathbf{x}_k$ from arbitrary initial conditions to $\mathbf{x}^* = 0$ in a way that conserves energy in the control. There are two roughly opposing desiderata: to reduce the error at the terminal time, $e_1 = \mathbf{x}_{t_f}^T \mathbf{x}_{t_f}$, and to use as little energy as possible, $e_2 = \sum_{k=0}^{t_f - 1} u_k^2$. In conventional optimal control, these are combined into a weighted sum:

\[ J(u_0, \ldots, u_{t_f - 1}) = \mathbf{x}_{t_f}^T \mathbf{x}_{t_f} + \sum_{k=0}^{t_f - 1} R\,u_k^2. \]

Linear optimal control provides a well-known answer. By classical optimal control, $u_k = -K_k \mathbf{x}_k$, where $K_k$ is the Kalman gain, $K_k = [B^T P_{k+1} B + R]^{-1} B^T P_{k+1} A$, with $P_k = A^T P_{k+1}(A - B K_k)$. Regarding this optimal solution we may make the following observations:

•  The  “optimal”  may  not  really  be  needed  (it  is  simply  suggested  as  a  way  to  formulate  the 
problem). 

•  Choosing a single performance index $J$ is arbitrary, as is the weighting between the desiderata measures.

•  There may be different units in $e_1$ and $e_2$; in some sense they are incommensurables.
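For reference, the classical gain recursion can be run backward in time and checked against a forward simulation. The sketch below uses plain lists for a two-state, single-input system, with the numerical values of $A$, $B$, and $R$ taken from the example given later in this section; the horizon `T_F`, the initial state, and the terminal condition $P_{t_f} = I$ (read off from the terminal cost $\mathbf{x}^T\mathbf{x}$) are assumptions.

```python
A = [[0.9974, 0.0539], [-0.1078, 1.1591]]
B = [0.0013, 0.0539]   # column vector stored as a flat list
R = 0.05
T_F = 20               # hypothetical horizon

def riccati_gains(t_f):
    """Backward recursion; returns gains K_0..K_{t_f-1} and P_0."""
    P = [[1.0, 0.0], [0.0, 1.0]]   # assumed terminal condition P_{t_f} = I
    gains = [None] * t_f
    for k in reversed(range(t_f)):
        BtP = [B[0] * P[0][j] + B[1] * P[1][j] for j in range(2)]
        s = BtP[0] * B[0] + BtP[1] * B[1] + R              # B^T P B + R (scalar)
        K = [(BtP[0] * A[0][j] + BtP[1] * A[1][j]) / s for j in range(2)]
        Acl = [[A[i][j] - B[i] * K[j] for j in range(2)] for i in range(2)]
        PA = [[P[i][0] * Acl[0][j] + P[i][1] * Acl[1][j] for j in range(2)]
              for i in range(2)]
        P = [[A[0][i] * PA[0][j] + A[1][i] * PA[1][j] for j in range(2)]
             for i in range(2)]                             # A^T P (A - B K)
        gains[k] = K
    return gains, P

def simulate(x0, gains):
    """Apply u_k = -K_k x_k and accumulate the cost J."""
    x, cost = list(x0), 0.0
    for K in gains:
        u = -(K[0] * x[0] + K[1] * x[1])
        cost += R * u * u
        x = [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u,
             A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u]
    return cost + x[0] ** 2 + x[1] ** 2

x0 = [1.0, 0.0]
gains, P0 = riccati_gains(T_F)
j_simulated = simulate(x0, gains)
j_predicted = sum(x0[i] * P0[i][j] * x0[j] for i in range(2) for j in range(2))
j_no_control = simulate(x0, [[0.0, 0.0]] * T_F)
```

The simulated cost should match the quadratic form $\mathbf{x}_0^T P_0 \mathbf{x}_0$, which is a convenient consistency check on the recursion, and it should not exceed the cost of applying no control at all.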

In the praxeic approach to this problem, we do not seek a global optimum over the entire ensemble. We consider, therefore, a receding horizon of $d$ control inputs $\{u_k, \ldots, u_{k+d-1}\}$ computed as a function of state $\mathbf{x}_k$. The control $u_k$ is implemented, moving to state $\mathbf{x}_{k+1}$, and the process is repeated. We will deal with $d = 1$ (one-step look-ahead).

We begin by imposing upper and lower bounds on the control variable: $U_m = (-u_m, u_m)$. The selectability, associated with the target position goal, is defined as follows: If $k + 1$ were the terminal time, the cost associated with this time would be $\Phi(u, \mathbf{x}_k) = \mathbf{x}_{k+1}^T \mathbf{x}_{k+1} = [A\mathbf{x}_k + Bu]^T [A\mathbf{x}_k + Bu]$. (If this were our only constraint, we could move immediately to the correct final value.) We define the function

\[ g_S(u, \mathbf{x}_k) = \sup_{v \in U_m} \{\Phi(v, \mathbf{x}_k) - \Phi(u, \mathbf{x}_k)\}. \]

We want control values that make $\Phi$ small to have high selectability. (We will normalize momentarily.)

The rejectability, associated with control costs, is simply taken to be proportional to power:

\[ g_R(u) = R u^2. \]
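We can restrict the range to an equilibrium region (the endpoints can be computed because these are quadratic functions):

\[ u_S = \arg\max_{u \in U_m} g_S(u, \mathbf{x}_k), \qquad u_R = \arg\min_{u \in U_m} g_R(u). \]

Then we define the equilibrium set by $u_* = \min(u_S, u_R)$, $u^* = \max(u_S, u_R)$, $U = [u_*, u^*]$. Now let

\[ p_S(u; \mathbf{x}_k) = \frac{g_S(u; \mathbf{x}_k)}{G_S(\mathbf{x}_k)}, \qquad p_R(u) = \frac{g_R(u)}{G_R(\mathbf{x}_k)}, \]

where

\[ G_S(\mathbf{x}_k) = \int_{u_*}^{u^*} g_S(v; \mathbf{x}_k)\,dv, \qquad G_R(\mathbf{x}_k) = \int_{u_*}^{u^*} g_R(v)\,dv \]

are normalizing terms.

Most selectable: $u_S = -[B^T B]^{-1} B^T A \mathbf{x}_k$. Least rejectable: $u_R = 0$.

Observe that $p_S(u; \mathbf{x}_k)$ is concave and $p_R(u)$ is convex. Thus all possible satisficing equilibrium controls are obtained as convex combinations of $u_R$ and $u_S$: $u_\lambda = \lambda u_R + (1 - \lambda) u_S$.

The most discriminating control is that which maximizes $p_S(u; \mathbf{x}_k) - b\,p_R(u)$ with respect to $u$. Setting the derivative with respect to $u$ to zero gives

\[ u_k = -[B^T B + b' R]^{-1} B^T A \mathbf{x}_k, \]

where $b' = b\,G_S(\mathbf{x}_k)/G_R(\mathbf{x}_k)$. We thus have $u_k = -\mathcal{K}_k \mathbf{x}_k$, where $\mathcal{K}_k = [B^T B + b' R]^{-1} B^T A$, which is state feedback (like the optimal solution), but not linear since $b'$ depends on $\mathbf{x}_k$.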

An example of this control law is as follows:

\[
A = \begin{bmatrix} 0.9974 & 0.0539 \\ -0.1078 & 1.1591 \end{bmatrix}, \qquad
B = \begin{bmatrix} 0.0013 \\ 0.0539 \end{bmatrix}, \qquad
R = 0.05.
\]

Then the control input $u[k]$ is shown in figure 4.1, and the phase-plane trajectories are shown in figure 4.2, where the solid line indicates the optimal solution, and the dashed lines indicate the praxeic control for $b = 1$ and $b = 1.6$.
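A one-step praxeic controller for this example system can be sketched as follows. The code is an illustration under stated assumptions rather than the implementation behind the figures: the control bound `U_M` is hypothetical, the normalizing integrals are approximated by a midpoint rule over the interval between the least rejectable input (zero) and the most selectable input, and the effective weight `b_prime` is obtained from the stationarity condition of $p_S - b\,p_R$, taken here as $b\,G_S/G_R$.

```python
A = [[0.9974, 0.0539], [-0.1078, 1.1591]]
B = [0.0013, 0.0539]
R = 0.05
U_M = 5.0   # hypothetical control bound u_m

def next_state(x, u):
    return [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u,
            A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u]

def phi(x, u):
    """Squared norm of the next state, the one-step terminal cost."""
    nx = next_state(x, u)
    return nx[0] ** 2 + nx[1] ** 2

def normalizers(x, n=2000):
    """Return u_S, sup of phi over the bounds, and the integrals G_S, G_R."""
    btb = B[0] ** 2 + B[1] ** 2
    ax = next_state(x, 0.0)
    u_s = -(B[0] * ax[0] + B[1] * ax[1]) / btb         # most selectable input
    lo, hi = min(u_s, 0.0), max(u_s, 0.0)              # equilibrium interval
    phi_sup = max(phi(x, -U_M), phi(x, U_M))           # phi is convex in u
    du = (hi - lo) / n
    mids = [lo + (i + 0.5) * du for i in range(n)]
    g_s_int = sum(phi_sup - phi(x, v) for v in mids) * du
    g_r_int = sum(R * v * v for v in mids) * du
    return u_s, phi_sup, g_s_int, g_r_int

def praxeic_control(x, b):
    """Most discriminating one-step control for state x and boldness b."""
    u_s, _, g_s_int, g_r_int = normalizers(x)
    if u_s == 0.0:
        return 0.0
    btb = B[0] ** 2 + B[1] ** 2
    b_prime = b * g_s_int / g_r_int
    return u_s * btb / (btb + b_prime * R)

x = [1.0, 0.0]
u_praxeic = praxeic_control(x, b=1.0)
```

The resulting input always lies between the least rejectable input and the most selectable input, and a larger boldness pulls it toward zero.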


4.3  Extending  praxeic  utility  to  multi-agent  systems 

As before, this material is drawn closely from [20].

We  have  seen  that  the  praxeic  viewpoint  eschews  the  imperative  to  select  only  the  best  solution, 
considering  in  addition  solutions  which  are  arguably  “good  enough.”  This  additional  flexibility 
is  important  when  dealing  with  multiagent  systems.  In  this  section  we  discuss  aspects  of  this 
multiagent  application. 

As  Stirling  has  observed, 

“Group rationality is not a logical consequence of rationality based on individual self interest. Under substantive rationality, where maximization of satisfaction is the operative norm, group behavior, consisting of the collection of individual behaviors, is not usually optimized by optimizing each individual behavior, as is done under conventional game theory. Unfortunately, those who put their final confidence in the limited perspective of exclusive self-interest may ultimately function disjunctively, and perhaps illogically, when participating in collective inferences.”

Coordination among agents is a neutral concept: Cooperation, negotiation, competition, and any other form of behavior short of complete indifference and isolation between agents will involve some form of coordination. Any decision by an agent that uses information concerning the existence, decisions, or decision-making strategies of any other agent is a coordinated decision.

4.3.1  The  View  from  the  Praxeic  Utilitarian 

1.  My  understanding  (or  estimate)  of  your  selectability  and  rejectability  functions  may  affect 
my  own  selectability  and  rejectability. 

2.  Group  decisions  require  the  formulation  of  joint  selectability  and  rejectability. 

3.  Set  decisions  provide  more  flexibility  for  achieving  jointly  acceptable  solutions. 

4.  Boldness  becomes  a  tool  for  negotiation. 

In the praxeic framework, a single optimum strategy is not sought. Instead, each agent considers options based on joint selectabilities and rejectabilities, where a selectability for an agent may incorporate selectabilities and/or rejectabilities of other agents. These utility functions have the properties of probabilities and may frequently be codified using conditional probabilities, which essentially provide an interpolated rule-set for the agents. This framework can encompass traditional game theory, but provides in addition much more flexibility and a closer approximation to the operations of human deliberations.

4.3.2  Notation 

In the problem formulation each agent $X_1, X_2, \ldots, X_N$ is endowed with its own decision space $U_i$, and the joint decision space is formed as the cross product of these decision spaces, $U_{1:N} = U_1 \times \cdots \times U_N$. There is also a joint Boolean algebra over these decision spaces, $\mathcal{F}_{1:N} = \mathcal{F}_1 \times \cdots \times \mathcal{F}_N$. The joint rejectability function $m_{1:N} : \mathcal{F}_{1:N} \to [0, 1]$ maps a joint decision to a rejectability value. The joint selectability function $q_{1:N} : \mathcal{F}_{1:N} \to [0, 1]$ maps a joint decision to a selectability value.

Each agent has its own boldness $b_i$, and a joint boldness is computed as a product: $b_{1:N} = b_1 b_2 \cdots b_N$.

The selectabilities and rejectabilities of all agents are represented in a coordination function, a joint inter-agent rejectability and credibility function

\[ f_{R_1 S_1 R_2 S_2 \cdots R_N S_N}(x_1, y_1, x_2, y_2, \ldots, x_N, y_N) \]

for $(x_1, \ldots, x_N) \in U_{1:N}$ and $(y_1, \ldots, y_N) \in U_{1:N}$. (This illustrates a potential serious problem with this approach: the argument space is very large.)

Once a joint inter-agent rejectability and credibility function is computed, the selectability and rejectability for individual agents can be computed as marginals of the coordination function.

Frequently, the inter-agent rejectability and credibility function is obtained by conditioning; this can reduce the effective dimensionality of the problem. For example,

\[ f_{R_1, S_1, R_2, S_2} = f_{R_1, S_1 | R_2, S_2}\, f_{R_2, S_2}. \]

$f_{R_1, S_1 | R_2, S_2}$ represents $X_1$’s values (expressed as rejectability) and beliefs (expressed as selectability), conditioned upon what $X_2$ values and believes. Note that $f_{R_2, S_2} = f_{R_2} f_{S_2}$.

Further conditioning: $f_{R_1, S_1 | R_2, S_2} = f_{S_1 | R_1, R_2, S_2}\, f_{R_1 | R_2, S_2}$. We will set up some parameterized values for these functions.

4.3.3  An  illustrative  example:  The  prisoner’s  dilemma 

The prisoner's dilemma is a famous problem in game theory; we present it here to illustrate how the praxeic concepts can be extended to multiple agent systems. In this game, two agents, $X_1$ and $X_2$, have been charged with a serious crime, arrested, and incarcerated so they cannot communicate. The prosecution has evidence to convict both of a lesser offense. To get at least one conviction on the more serious crime, the prosecution entices each prisoner to give evidence against the other in return for dropping charges. If both confess, each receives a prison term of intermediate length (for cooperation). The payoff matrix for the game follows; each entry lists the outcome for $X_1$ first and $X_2$ second, and a smaller number is a more preferred outcome.

                      X2 silent    X2 confesses
    X1 silent            2,2           4,1
    X1 confesses         1,4           3,3

The choices in this game will be denoted by $H_0$: silence and $H_1$: confess. (Denote the choice by 0 or 1.)

Then a reasonable assignment to a factored joint function, in terms of the rejectability, might be:

• $f_{R_1|R_2,S_2}(0|0,0) = 1$, $f_{R_1|R_2,S_2}(1|0,0) = 0$: Given that $X_2$ values rejecting silence (confession), rejecting silence is preferred. Given that $X_2$ believes in silence, it would be in $X_1$'s interest to be silent. By this conditioning, $X_2$ is confused, so we go with the safe option.

• $f_{R_1|R_2,S_2}(0|1,0) = 1 - \varepsilon$, $f_{R_1|R_2,S_2}(1|1,0) = \varepsilon$: $X_2$ values and believes in silence. $1 - \varepsilon$ is the degree to which $X_1$ wants to exploit this willingness.

• $f_{R_1|R_2,S_2}(0|0,1) = 1 - \mu$, $f_{R_1|R_2,S_2}(1|0,1) = \mu$: $X_2$ values and believes in confession. $\mu$ is the degree to which $X_1$ is willing to be exploited (martyrdom).

• $f_{R_1|R_2,S_2}(0|1,1) = 1$, $f_{R_1|R_2,S_2}(1|1,1) = 0$: again, $X_2$ is confused, and we go with the safe option.

The parameters $\mu$ and $\varepsilon$ represent willingness to give and take, but it is not necessary that $\mu = 1 - \varepsilon$.

The selectability function can be represented as

• $f_{S_1|R_1,R_2,S_2}(0|0,0,0) = 0$, $f_{S_1|R_1,R_2,S_2}(1|0,0,0) = 1$, $f_{S_1|R_1,R_2,S_2}(0|1,0,0) = 0$, $f_{S_1|R_1,R_2,S_2}(1|1,0,0) = 1$: $X_2$ is confused: go with the safe bet.

• $f_{S_1|R_1,R_2,S_2}(0|0,0,1) = \alpha$, $f_{S_1|R_1,R_2,S_2}(1|0,0,1) = 1 - \alpha$: $X_2$ wants to confess. $\alpha$ measures $X_1$'s belief that best interest lies in self defense.

• $f_{S_1|R_1,R_2,S_2}(0|1,0,1) = \chi$, $f_{S_1|R_1,R_2,S_2}(1|1,0,1) = 1 - \chi$: $\chi$ measures masochism: acting against values.

• $f_{S_1|R_1,R_2,S_2}(0|0,1,1) = 0$, $f_{S_1|R_1,R_2,S_2}(1|0,1,1) = 0$, $f_{S_1|R_1,R_2,S_2}(0|1,1,1) = 0$, $f_{S_1|R_1,R_2,S_2}(1|1,1,1) = 0$: $X_2$ is confused: go with the safe bet.

4.3.4  Agent  reasoning  and  deliberation 

In addition to “static” representations of rejectability and selectability, there is also the notion of reaction: a model which enables $X_1$ to predict $X_2$'s decision is said to be reactive. A reactive model is illustrated here:


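In the context of reactive response, or deliberation, let $f_{R_2S_2|R_1S_1}$ be the estimate $X_1$ has of $X_2$'s conditional coordination function, and let $f^{[0]}_{R_2S_2}$ be $X_1$'s a priori estimate of $X_2$. $X_1$ may obtain an initial estimate of the coordination function as

$$f^{[1]}_{R_1S_1R_2S_2} = f_{R_1S_1|R_2S_2}\, f^{[0]}_{R_2S_2}.$$

Then the joint rejectability/selectability is

$$f^{[1]}_{R_1S_1}(r_1, s_1) = \sum_{r_2 \in U_2} \sum_{s_2 \in U_2} f^{[1]}_{R_1S_1R_2S_2}(r_1, s_1, r_2, s_2).$$

• $X_1$ may adopt a reactive decision now: form

$$f^{[1]}_{R_1}(r_1) = \sum_{s_1 \in U_1} f^{[1]}_{R_1S_1}(r_1, s_1)$$

and

$$f^{[1]}_{S_1}(s_1) = \sum_{r_1 \in U_1} f^{[1]}_{R_1S_1}(r_1, s_1),$$

then use the rule of epistemic utility.

• Or $X_1$ may adopt a deliberative strategy: $X_1$ may impute to $X_2$ his own methods:

$$\tilde{f}^{[1]}_{R_1S_1R_2S_2} = f_{R_2S_2|R_1S_1}\, f^{[1]}_{R_1S_1},$$

$$f^{[1]}_{R_2S_2}(r_2, s_2) = \sum_{s_1 \in U_1} \sum_{r_1 \in U_1} \tilde{f}^{[1]}_{R_1S_1R_2S_2}(r_1, s_1, r_2, s_2).$$

An agent can “deliberate” many iterations:

$$f^{[0]}_{R_2S_2} \to f^{[1]}_{R_1S_1} \to f^{[1]}_{R_2S_2} \to f^{[2]}_{R_1S_1} \to \cdots \to f^{[n]}_{R_2S_2} \to f^{[n+1]}_{R_1S_1}$$

$$f^{[n+1]}_{R_1S_1R_2S_2} = f_{R_1S_1|R_2S_2}\, f^{[n]}_{R_2S_2}$$

$$f^{[n+1]}_{R_1S_1}(r_1, s_1) = \sum_{s_2 \in U_2} \sum_{r_2 \in U_2} f^{[n+1]}_{R_1S_1R_2S_2}(r_1, s_1, r_2, s_2)$$

$$\tilde{f}^{[n+1]}_{R_1S_1R_2S_2} = f_{R_2S_2|R_1S_1}\, f^{[n+1]}_{R_1S_1}$$

$$f^{[n+1]}_{R_2S_2}(r_2, s_2) = \sum_{s_1 \in U_1} \sum_{r_1 \in U_1} \tilde{f}^{[n+1]}_{R_1S_1R_2S_2}(r_1, s_1, r_2, s_2).$$

The deliberation can be expressed in terms of matrices, and the convergent solution can be found: it is the eigenvector corresponding to the unit eigenvalue of the appropriate matrix.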
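As a rough illustration of the matrix view, the sketch below treats one round of deliberation as two matrix-vector products and iterates until the estimate of $X_2$ stops changing. It assumes each conditional coordination function has been flattened into a column-stochastic matrix; the 2-by-2 matrices `A` and `B`, their entries, and the function names are hypothetical stand-ins (with binary decisions the flattened state would have four entries, one per rejectability/selectability pair).

```python
def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def deliberate(a, b, f2, tol=1e-12, max_iter=1000):
    """Alternate f1 = A f2 and f2 = B f1 until f2 stops changing.

    a[i][j] plays the role of f_{R1S1|R2S2}(i | j) and b[i][j] the role of
    f_{R2S2|R1S1}(i | j); both are assumed to be column-stochastic.
    """
    for _ in range(max_iter):
        f2_next = mat_vec(b, mat_vec(a, f2))
        if max(abs(p - q) for p, q in zip(f2_next, f2)) < tol:
            return mat_vec(a, f2_next), f2_next
        f2 = f2_next
    raise RuntimeError("deliberation did not converge")


# Hypothetical conditional matrices (each column sums to 1).
A = [[0.9, 0.2],
     [0.1, 0.8]]
B = [[0.7, 0.4],
     [0.3, 0.6]]

f1_star, f2_star = deliberate(A, B, [0.5, 0.5])
print(f1_star, f2_star)
```

The limit `f2_star` is the eigenvector of the product matrix B·A for eigenvalue 1, scaled to sum to one. Plain power iteration is enough here because every entry of the product is positive; a less well-behaved matrix may need a direct eigen-solver.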


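Lemma 1 When $b = 1$, cooperation cannot occur unless $\chi > 0$ and $\mu > 0$.

For example, in the Prisoner's dilemma, when $\chi \to 0$, we find the rejectability and selectability of the fixed point to be

$$m_1 = \begin{bmatrix} \dfrac{1 + \alpha - \mu\alpha}{1 + \mu + \alpha - \mu\alpha} \\[2ex] \dfrac{\mu}{1 + \mu + \alpha - \mu\alpha} \end{bmatrix}, \qquad Q_1 = \begin{bmatrix} \dfrac{\alpha - \mu\alpha}{1 + \mu + \alpha - \mu\alpha} \\[2ex] \dfrac{1 + \mu}{1 + \mu + \alpha - \mu\alpha} \end{bmatrix}.$$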
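A short check shows why the limit $\chi \to 0$ rules out cooperation at unit boldness. It assumes that the first entry of each fixed-point vector corresponds to silence ($H_0$) and the second to confession ($H_1$), and it takes the silence entries to be $(1 + \alpha - \mu\alpha)/D$ for rejectability and $(\alpha - \mu\alpha)/D$ for selectability, with $D = 1 + \mu + \alpha - \mu\alpha$ the common denominator:

```latex
\begin{align*}
Q_1(H_0) \ge b\, m_1(H_0)
  &\iff \frac{\alpha - \mu\alpha}{D} \ge \frac{1 + \alpha - \mu\alpha}{D}
   \qquad (b = 1,\ D \ge 1 > 0) \\
  &\iff 0 \ge 1 .
\end{align*}
```

The last inequality never holds, so silence is not satisficing in this limit. For confession the same test reads $(1 + \mu)/D \ge \mu/D$, which always holds.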

So if there is any hope of cooperation, each prisoner must to some degree value martyrdom ($\mu > 0$) and must also have non-zero selectability that a masochistic decision is in its best interest ($\chi > 0$).


Lemma 2 When $b = 1$ and $\chi, \mu > 0$, silence ($H_0$) is selected when $\chi\mu > \varepsilon + \varepsilon - \varepsilon\varepsilon$. Note that this does not depend upon c or $\alpha$.

4.3.5 General formulation of multiagent epistemic/praxeic decision making

In the general case, we form a joint selectability/rejectability function

$$p_{S_1,S_2,\ldots,S_N,R_1,R_2,\ldots,R_N}(u_1, u_2, \ldots, u_N, v_1, v_2, \ldots, v_N) = p_{SR}(u, v),$$

where $u_i \in U_i$ and $v_i \in U_i$. Let $U = U_1 \times U_2 \times \cdots \times U_N$.

As we have seen, determining the values is frequently accomplished by factorizations. For example, in a two-agent system we might have

$$p_{S_1S_2R_1R_2}(u_1, u_2, v_1, v_2) = p_{S_1|S_2R_1R_2}(u_1|u_2, v_1, v_2)\, p_{S_2|R_1R_2}(u_2|v_1, v_2)\, p_{R_1|R_2}(v_1|v_2)\, p_{R_2}(v_2)$$

$$= p_{S_1|S_2R_2}(u_1|u_2, v_2)\, p_{S_2|R_1}(u_2|v_1)\, p_{R_1|R_2}(v_1|v_2)\, p_{R_2}(v_2)$$

(selectability does not depend on rejectability).

We form the multipartite selectability and the multipartite rejectability as marginals:

$$p_S(u) = \sum_{v \in U} p_{SR}(u, v), \qquad p_R(v) = \sum_{u \in U} p_{SR}(u, v).$$

The Multipartite Decision rule forms the multipartite satisficing set as

$$S_b = \{u \in U : p_S(u) \ge b\, p_R(u)\}.$$

Individual selectabilities and rejectabilities are computed using further marginalization:

$$p_{S_i}(u_i) = \sum_{u_j \in U_j,\ j \ne i} p_{S_1,\ldots,S_N}(u_1, \ldots, u_N), \qquad p_{R_i}(v_i) = \sum_{v_j \in U_j,\ j \ne i} p_{R_1,\ldots,R_N}(v_1, \ldots, v_N).$$

Then an Individual Satisficing set is formed as

$$S_b^i = \{u_i \in U_i : p_{S_i}(u_i) \ge b\, p_{R_i}(u_i)\}.$$

Denoting $R_b = S_b^1 \times S_b^2 \times \cdots \times S_b^N$ as the satisficing rectangle for all agents, we may ask: is $R_b = S_b$? Is the collection of individual decisions equivalent to the multipartite decision? That is, do the individual decisions coincide with the group decisions? In general, the answer is no. In this case, there is need for negotiation.
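The gap between the group decision and the individual decisions can be seen on a toy case. The sketch below is only an illustration: the two-agent mass functions `p_s` and `p_r` are hypothetical numbers, the function names are not from the text, and the comparison is taken as "selectability at least boldness times rejectability". Exact fractions are used so that ties are not disturbed by rounding.

```python
from fractions import Fraction as F
from itertools import product


def marginal(p, i):
    """Sum a joint mass function over every component except component i."""
    out = {}
    for u, mass in p.items():
        out[u[i]] = out.get(u[i], 0) + mass
    return out


def joint_satisficing(p_s, p_r, b):
    """Joint decisions whose selectability is at least b times rejectability."""
    return {u for u in p_s if p_s[u] >= b * p_r[u]}


def satisficing_rectangle(p_s, p_r, b):
    """Cartesian product of the individually satisficing sets."""
    n = len(next(iter(p_s)))
    individual = []
    for i in range(n):
        ms, mr = marginal(p_s, i), marginal(p_r, i)
        individual.append({ui for ui in ms if ms[ui] >= b * mr[ui]})
    return set(product(*individual))


# Hypothetical two-agent mass functions over U = {0,1} x {0,1}.
p_s = {(0, 0): F(4, 10), (0, 1): F(1, 10), (1, 0): F(1, 10), (1, 1): F(4, 10)}
p_r = {(0, 0): F(3, 10), (0, 1): F(2, 10), (1, 0): F(2, 10), (1, 1): F(3, 10)}

joint = joint_satisficing(p_s, p_r, b=1)
rectangle = satisficing_rectangle(p_s, p_r, b=1)
print(sorted(joint), sorted(rectangle))
```

With these numbers every individual option is satisficing, so the rectangle contains all four joint choices, while only the two matched choices are jointly satisficing.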

4.3.6  Negotiation 

We  may  make  the  following  observations  about  negotiation: 

• Negotiators are usually more concerned with meeting minimal requirements than with achieving maximum performance.

• Negotiations should lead to decisions that are good enough for the group as a whole and good enough for each individual.

• Negotiation is a narrowing down of options.

A model for multiagent decision making should reasonably support a method of negotiation which supports these concepts. As we now show, this is the case.

The set $S_b$ (the jointly satisficing set) represents choices that are good enough for the group. The set $R_b$ (the satisficing rectangle) represents choices that are good enough for the individuals in the group.

The Negotiation Theorem states that if $s_i$ is individually satisficing for $X_i$, that is, $s_i \in S_b^i$, then it must be the corresponding element of some jointly satisficing vector $s \in S_b$. By this theorem, no one is “frozen out” of a deal.

In  the  context  of  satisficing,  a  means  of  representing  the  “lowering  of  standards”  for  group 
accommodation  is  the  boldness. 

Let $b_i$ be the boldness for $X_i$, $b = (b_1, \ldots, b_N)$, and $b_L = \min\{b_1, \ldots, b_N\}$.

Now form a compromise set of choices that are individually satisficing:

$$C_i = \{s = (s_1, \ldots, s_N) \in S_{b_L} : s_i \in S^i_{b_i}\}.$$

(This uses $b_L$: the standards of a group can be no higher than the standards of any member of the group.)

A choice $s = (s_1, \ldots, s_N)$ is a satisficing imputation at boldness $b$ if $p_S(s) \ge b_L\, p_R(s)$ and $p_{S_i}(s_i) \ge b_i\, p_{R_i}(s_i)$ for $i = 1, 2, \ldots, N$: it is jointly satisficing for the group, and each component is individually satisficing for its corresponding member of the group. The satisficing imputation set $N_b$ is the set of satisficing imputations:

$$N_b = \bigcap_{i=1}^{N} C_i.$$

A  method  of  negotiation  is  to  have  each  agent  lower  its  own  boldness  until  a  non-empty  Nb  is 
obtained. 

Based on this concept, we present the following algorithm for negotiation:

Each agent chooses a boldness $b_i$ (typically $b_i = 1$ to start).

1. $X_i$ forms $S_{b_L}$ and $S^i_{b_i}$ for $i = 1, 2, \ldots, N$.

2. $X_i$ forms its compromise set by eliminating all choices for which its component is not individually satisficing. This gives $C_i = \{s \in S_{b_L} : s_i \in S^i_{b_i}\}$.

3. $X_i$ shares $C_i$ and $b_i$ with all other agents.

4. The satisficing imputation set $N_b = \bigcap_{i=1}^{N} C_i$ is formed. If $N_b = \emptyset$, then decrement $b_j$ for $j = 1, \ldots, N$, and return to step 1.

5. After completion, $X_i$ implements the $i$th component of the rational compromise

$$s^* = \arg\max_{s \in N_b} \frac{p_{S_1,\ldots,S_N}(s)}{p_{R_1,\ldots,R_N}(s)}.$$
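The steps above can be sketched in a few lines. This is a centralized simulation under stated assumptions, not the distributed exchange the algorithm describes: every agent lowers its boldness by the same fixed `step` (the text does not say how much to decrement), the compromise is the imputation with the largest selectability-to-rejectability ratio, and the mass functions `p_s` and `p_r` are hypothetical numbers chosen so that no imputation exists at unit boldness.

```python
from fractions import Fraction as F


def marginal(p, i):
    """Sum a joint mass function over every component except component i."""
    out = {}
    for u, mass in p.items():
        out[u[i]] = out.get(u[i], 0) + mass
    return out


def negotiate(p_s, p_r, step=F(1, 10)):
    """Lower all boldness values until the satisficing imputation set is non-empty."""
    n = len(next(iter(p_s)))
    ms = [marginal(p_s, i) for i in range(n)]
    mr = [marginal(p_r, i) for i in range(n)]
    boldness = [F(1)] * n
    while True:
        b_low = min(boldness)
        # Step 1: jointly satisficing set at the lowest boldness in the group.
        joint = {s for s in p_s if p_s[s] >= b_low * p_r[s]}
        # Step 2: each agent keeps the joint choices whose own component it accepts.
        compromise = [
            {s for s in joint if ms[i][s[i]] >= boldness[i] * mr[i][s[i]]}
            for i in range(n)
        ]
        # Steps 3 and 4: pool the compromise sets and intersect them.
        imputations = set.intersection(*compromise)
        if imputations:
            break
        # At boldness 0 every choice qualifies, so the loop always terminates.
        boldness = [max(b_i - step, F(0)) for b_i in boldness]

    def ratio(s):
        return p_s[s] / p_r[s] if p_r[s] > 0 else float("inf")

    # Step 5: the rational compromise.
    return max(sorted(imputations), key=ratio), boldness


# Hypothetical two-agent example over U = {0,1} x {0,1}.
p_s = {(0, 0): F(20, 100), (0, 1): F(35, 100), (1, 0): F(25, 100), (1, 1): F(20, 100)}
p_r = {(0, 0): F(30, 100), (0, 1): F(10, 100), (1, 0): F(10, 100), (1, 1): F(50, 100)}

choice, final_boldness = negotiate(p_s, p_r)
print(choice, final_boldness)
```

At unit boldness the two agents' compromise sets are disjoint; after one decrement the second agent accepts its less preferred component and a single imputation remains.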
4.3.7 A simple example of negotiation

$N$ pilots $X_1, \ldots, X_N$ are to collectively fly $M < N$ aircraft for mission $k$. Let $I(k) = \{i_1, \ldots, i_M\}$ denote the set of indices of participants, $1 \le i_j \le N$. Each $X_i$ has a skill level $s_i(k)$. Let $s(k) = \{s_1(k), \ldots, s_N(k)\}$. Let $\sigma(k) = \{s_{i_1}(k), \ldots, s_{i_M}(k)\}$, $i_j \in I(k)$, be the skill levels of the participants on mission $k$. Let $g_i(s)$ denote the flyer's satisfaction, with $g_i$ nondecreasing in $s$, and $0 \le g_i(s) \le 1$. Skill increases with experience:

$$s_i(k+1) = \begin{cases} f_1[g(\sigma(k)), s_i(k)] & \text{if } i \in I(k) \\ f_2[s_i(k)] & \text{if } i \notin I(k). \end{cases}$$

Let $g[\sigma(k)]$ denote the joint satisfaction function for the group. Each mission incurs risk to the flyers: we assume the risk function depends on the least-skilled flyer in the group, $r(\min_{i \in I(k)} s_i(k))$. The individual agents' goal is to increase skill level (satisfaction). The group goal is for all participants to increase skill levels, perhaps uniformly.

Agents must negotiate to obtain a group decision.

Let $U_i = \{1, 0\}$, indicating fly or don't fly. The group decision space is $U = \{0, 1\}^N$. The decision vector (of length $N$) must have exactly $M$ ones in it; there are $\binom{N}{M}$ possible choices in this set, which we designate as $\mathbf{U}_N$.

For a $u \in \mathbf{U}_N$, we can write the skill vector $\sigma(k)$ as $\sigma(k) = \Phi(u(k))\, s(k)$, where $\Phi(u)$ maps the vector to a matrix, as

$$\Phi(1100) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}.$$

We also have $g[\sigma(k)] = g[\Phi(u(k))\, s(k)]$.
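The selection map can be made concrete with a small sketch. The function names `phi`, `participant_skills`, and `admissible` are illustrative choices, and the skill values are made up; the only thing taken from the text is that the matrix has one row per flying pilot and picks that pilot's entry out of the skill vector.

```python
from itertools import combinations
from math import comb


def phi(u):
    """Selection matrix: one row per participant (u_i == 1), one column per pilot."""
    n = len(u)
    return [[1 if col == i else 0 for col in range(n)]
            for i, flies in enumerate(u) if flies == 1]


def participant_skills(u, s):
    """Skill vector of the participants, computed as phi(u) times s."""
    return [sum(a * x for a, x in zip(row, s)) for row in phi(u)]


def admissible(n, m):
    """All 0/1 decision vectors of length n with exactly m ones."""
    return [tuple(1 if i in chosen else 0 for i in range(n))
            for chosen in combinations(range(n), m)]


print(phi((1, 1, 0, 0)))
print(participant_skills((1, 0, 0, 1), [0.2, 0.5, 0.7, 0.9]))
print(len(admissible(4, 2)), comb(4, 2))
```

Enumerating the admissible vectors directly is only practical for small groups, since their number grows as the binomial coefficient of $N$ and $M$.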

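Selectability (want to maximize collective skills):

$$p_S(u; s(k)) = \begin{cases} \dfrac{g(\Phi(u)s(k))}{\sum_{v \in \mathbf{U}_N} g(\Phi(v)s(k))} & u \in \mathbf{U}_N \\[2ex] 0 & u \notin \mathbf{U}_N. \end{cases}$$

Rejectability (minimize risk):

$$p_R(u; s(k)) = \begin{cases} \dfrac{r(\min \Phi(u)s(k))}{\sum_{v \in \mathbf{U}_N} r(\min \Phi(v)s(k))} & u \in \mathbf{U}_N \\[2ex] \infty & u \notin \mathbf{U}_N. \end{cases}$$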


Skill  update:  Let /i[5(($(u(/c))s(A;)),  Si(A;)]  =  llsj(/c)  +  Q;(5f($(u(A;))s(A;))) As,  where  0  <  A  <  1 


^  '  1  ^-a{x-xo) 

Let  f2{si{k))  =  Asi{k)  (atrophy  without  use). 

Satisfaction:  Let^i(s)  =  . 

Risk:  Take  r(a;)  =  |  +  for  some  h  >  0. 

As a particular example, assume $N = 4$, $M = 2$.

The group selectability can be factored as

$$p_{S_1S_2S_3S_4}(u_1, u_2, u_3, u_4) = p_{S_1|S_2S_3S_4}(u_1|u_2, u_3, u_4)\, p_{S_2|S_3,S_4}(u_2|u_3, u_4)\, p_{S_3}(u_3)\, p_{S_4}(u_4).$$

(E.g., assume that $X_3$'s preferences are independent of $X_4$'s preferences.)

Now we need to specify all the conditional selectabilities:

$$p_{S_3}(1; s(k)) = 1 - g_3(s_3(k)), \qquad p_{S_3}(0; s(k)) = g_3(s_3(k)),$$

similarly for $p_{S_4}$.

For $p_{S_2|S_3,S_4}$: If both $X_3$ and $X_4$ ascribe their entire preference to flying, $X_2$ should not elect to fly. Otherwise, $X_2$ should go with its individual preferences.

$$p_{S_2|S_3,S_4}(1|1,1; s(k)) = 0, \qquad p_{S_2|S_3,S_4}(0|1,1; s(k)) = 1,$$

$$p_{S_2|S_3,S_4}(1|1,0; s(k)) = 1 - g_2(s_2(k)), \qquad p_{S_2|S_3,S_4}(0|1,0; s(k)) = g_2(s_2(k)),$$

and so forth.

For $p_{S_1|S_2,S_3,S_4}$: If two conditioning agents place their entire unit of preference on flying, then $X_1$ should not elect to fly. Otherwise, $X_1$ should go with its myopic preferences. Putting these all together we obtain the group preference function:

$$p_{S_1S_2S_3S_4}(1,1,0,0; s(k)) = (1 - g_1(s_1(k)))(1 - g_2(s_2(k)))\, g_3(s_3(k))\, g_4(s_4(k))$$
$$p_{S_1S_2S_3S_4}(1,0,1,0; s(k)) = (1 - g_1(s_1(k)))\, g_2(s_2(k))\, (1 - g_3(s_3(k)))\, g_4(s_4(k))$$
$$p_{S_1S_2S_3S_4}(1,0,0,1; s(k)) = (1 - g_1(s_1(k)))\, g_2(s_2(k))\, g_3(s_3(k))\, (1 - g_4(s_4(k)))$$
$$p_{S_1S_2S_3S_4}(0,1,1,0; s(k)) = g_1(s_1(k))\, (1 - g_2(s_2(k)))(1 - g_3(s_3(k)))\, g_4(s_4(k))$$
$$p_{S_1S_2S_3S_4}(0,1,0,1; s(k)) = g_1(s_1(k))\, (1 - g_2(s_2(k)))\, g_3(s_3(k))\, (1 - g_4(s_4(k)))$$
$$p_{S_1S_2S_3S_4}(0,0,1,1; s(k)) = g_1(s_1(k))\, g_2(s_2(k))\, (1 - g_3(s_3(k)))(1 - g_4(s_4(k)))$$
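The six products share one pattern: a flying agent contributes one minus its satisfaction, and a grounded agent contributes its satisfaction. The sketch below tabulates them for four pilots and two aircraft under that reading; the satisfaction values in `g` are hypothetical, the function names are illustrative, and the values are left unnormalized, as in the list above.

```python
from itertools import combinations


def group_preference(u, g):
    """Product of (1 - g_i) for agents that fly and g_i for agents that do not."""
    value = 1.0
    for flies, g_i in zip(u, g):
        value *= (1.0 - g_i) if flies else g_i
    return value


def admissible(n, m):
    """All 0/1 decision vectors of length n with exactly m ones."""
    return [tuple(1 if i in chosen else 0 for i in range(n))
            for chosen in combinations(range(n), m)]


# Hypothetical current satisfactions g_i(s_i(k)) for the four pilots.
g = [0.25, 0.5, 0.75, 0.875]
table = {u: group_preference(u, g) for u in admissible(4, 2)}
best = max(table, key=table.get)

for u, value in sorted(table.items(), key=lambda item: -item[1]):
    print(u, value)
```

The largest value goes to the assignment that flies the two least satisfied pilots, which is the mechanism that pushes skills toward each other over repeated missions.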

After  all  this  (to  demonstrate  the  mechanics  of  the  mathematics),  the  results  can  be  summarized 
(after  simulation):  With  arbitrary  initial  conditions  for  skill  levels,  simulations  converge  to  equal 
skills. 


4.4  An  example  application:  Resource  Allocation 

A system of $N$ agents $\{X_1, \ldots, X_N\}$ is to be assigned to do a number of tasks. For the sake
of  definiteness,  we  will  assume  three  distinct  classes  of  tasks  (from  which  generalization  should 
be  clear).  We  will  refer  to  the  tasks  as  flying,  sailing,  and  swabbing  —  tasks  that  require  vastly 
different  skill  sets.  Each  agent  is  endowed  with  a  skill  set  that  increases  with  use  on  a  particular 
task.  There  are  also  physical  resources  necessary  for  tasks,  although  not  enough  that  everyone 
can  be  assigned  to  the  resource  for  every  mission.  These  resources  are  planes,  boats,  and  mops. 
There is a desire among agents to increase skill in either flying or sailing (depending upon the classification of the agent), but probably no desire to improve skill at swabbing.

In addition, there is another agent $X_0$ representing “the management,” whose job it is to see that the mandatory tasks are completed. Management is only incidentally interested in seeing to the general satisfaction of the others.

The jobs at time $t$ can be decomposed into a class of “mandatory jobs” and a class of “optional
jobs.”  Mandatory  jobs  are  those  that,  from  the  perspective  of  management,  must  be  done  —  actual 
missions  to  accomplish  something.  The  optional  jobs  are  available  for  workers  (who,  for  example, 
might  wish  to  increase  their  skills),  but  are  not  first  priority.  Actual  missions  may  involve  higher 
risk  than  other  missions,  or  greater  discomfort.  Management  always  wants  to  see  the  mandatory 
jobs  done  first;  the  other  agents  might  prefer  the  optional  jobs  first  due  to  their  lower  risk.  (We  will 
assume  that  in  either  case  the  skill  level  increase  is  the  same  for  both  mandatory  or  optional  jobs; 
down  the  road  we  might  want  to  consider  varying  these,  to  build  in  the  fact  that  there  should  be 
better  rewards  for  riskier  duty.) 

We denote by $M^j(t)$ the requested number of tasks of type $j$ to be performed at time $t$, and write $M^1(t) + M^2(t) + M^3(t) = M(t)$. Of the $M^j(t)$ jobs of type $j$, we will denote $\bar{M}^j(t) \le M^j(t)$ of them as mandatory. The number of agents actually assigned to task $j$ is $\hat{M}^j(t)$, and it will always be the case that $\hat{M}^j(t) \le M^j(t)$. (Can't fill more missions than there is room for.) We take

$$\hat{M}(t) = \hat{M}^1(t) + \hat{M}^2(t) + \hat{M}^3(t).$$

(In  a  more  complete  model,  aspects  of  the  mission  related  to  piloting  skills,  such  as  time  of 
day,  light  levels,  etc.,  should  be  incorporated.) 

The decision space for each $X_i$ is the set {do nothing, fly, sail, swab}, which we represent as $U_i = \{0, 1, 2, 3\}$. The group decision space is $U = \{0, 1, 2, 3\}^N$. The decision vector $u(t)$ is an integer string of length $N$ such that $d_H(u(t), 0) = \hat{M}(t)$, the total number of agents assigned. ($d_H$ is the Hamming distance.) Then $[u(t)]_i$ is the task assigned to agent $X_i$. The number of agents assigned to task $j$, $\hat{M}^j(t)$, is a function of the decision vector at time $t$:

$$\hat{M}^j(t) = \hat{M}^j(u, t) = |\{i : u_i = j,\ i = 1, 2, \ldots, N\}|,$$

where $|A|$ denotes the number of elements in the set $A$.

The set of admissible combinations of choices $\mathcal{U}(t)$ is determined by

$$\mathcal{U}(t) = \{u \in U : \hat{M}^j(u) \le M^j(t),\ j = 1, 2, 3\}.$$

This allows for the possibility of incomplete missions.¹
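The counting and admissibility test can be sketched as follows. The encoding 0 = do nothing, 1 = fly, 2 = sail, 3 = swab is the one used above; the function names, the example decision vector, and the `requested` dictionary are illustrative assumptions.

```python
from collections import Counter

TASKS = (1, 2, 3)  # 1 = fly, 2 = sail, 3 = swab; 0 means "do nothing"


def task_counts(u):
    """Number of agents that the decision vector u assigns to each task."""
    counts = Counter(u)
    return {j: counts.get(j, 0) for j in TASKS}


def assigned_total(u):
    """Hamming distance between u and the all-zero vector."""
    return sum(1 for u_i in u if u_i != 0)


def is_admissible(u, requested):
    """True when no task receives more agents than were requested for it."""
    counts = task_counts(u)
    return all(counts[j] <= requested[j] for j in TASKS)


u = (1, 1, 0, 2, 3, 0)
print(task_counts(u), assigned_total(u))
print(is_admissible(u, {1: 2, 2: 1, 3: 1}))
print(is_admissible(u, {1: 1, 2: 1, 3: 1}))
```

Because the test only bounds the counts from above, a vector that leaves requested jobs unfilled is still admissible, which is the "incomplete missions" case.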

4.4.1  Agent  descriptions 

1. For an agent $X_i$, $i > 0$, let its skill vector at time $t$ be

$$s_i(t) = [s_i^1(t),\ s_i^2(t),\ s_i^3(t)],$$

where the superscript indicates the task, 1 = flying, 2 = sailing, and 3 = swabbing.

¹ A key issue to be developed is localization. The computational complexity of this problem is going to grow very rapidly. Can the overall computations be partitioned into reasonable subproblems?

Another key aspect that should be examined is the revision of schedules: what if a schedule is established, and new conditions develop? Can a means be found to modify the schedule (with minimum impact while still finding an effective schedule)?

2. The skill matrix across all agents is

$$S(t) = \begin{bmatrix} s_1(t) \\ \vdots \\ s_N(t) \end{bmatrix}.$$


3. Let $I^1(t)$ indicate the set of indices of participants on the flying mission, $I^1(t) = \{i_1, \ldots, i_{\hat{M}^1(t)}\}$, where $|I^1(t)| = \hat{M}^1(t)$ is the number of agents on the first mission at time $t$. Similarly define $I^2(t)$ and $I^3(t)$ for sailing and swabbing, respectively. Note that $I^i \cap I^j = \emptyset$ for $i \ne j$, and that $M^1(t) + M^2(t) + M^3(t) = M(t)$. The number of missions of each type may vary at each time.


4. Let

$$\sigma^1(t) = \{s_i^1(t) : i \in I^1(t)\}$$

denote the skill vector for the entire team that is flying at time $t$, and similarly define skill vectors for sailing and swabbing teams with $\sigma^2(t)$ and $\sigma^3(t)$.


5. As regards measuring satisfaction, this is somewhat more complicated. In Wynn's original development, the satisfaction is a monotonic function of the skill. However, an agent assigned to a task in which he is uninterested may not demonstrate increased satisfaction. We therefore assume that each agent has a specified interest area, and that satisfaction is measured as a monotonic function of skill in that specific area. We denote the specified interest area of $X_i$ as $a_i \in \{1, 2, 3\}$. For our purposes this will be assumed to be fixed, but more generally might change over time. (For example, if an agent finds himself highly skilled in an area different from his specified area, he might determine greater satisfaction by switching.)

Let $g_i(t)$ denote the satisfaction of $X_i$ at time $t$. Then

$$g_i(t) = \begin{cases} g_i(s_i^j(t)) & \text{if } i \in I^j(t),\ j = a_i \\ \tilde{g}_i(s_i^j(t)) & \text{if } i \in I^j(t),\ j \ne a_i, \end{cases} \qquad (4.1)$$

where it is assumed that $g_i(x) > \tilde{g}_i(x)$ (greater satisfaction for preferred skills). Let $\mathcal{J}^j$ denote “interest group” $j$, by

$$\mathcal{J}^j = \{i : a_i = j\}.$$


6. Risk: From the management's point of view, there may be risk involved in an open process. In particular, in an “agent's market,” an agent who becomes fed up with the way of doing business may simply walk away from the table. This does not necessarily coincide with a military way of doing business, but does introduce an interesting way of measuring the cost of holding out. Specifically, the following might be elements of risk:

• There is the risk that agents may lose interest if the negotiations proceed for too long.

• Risk may be tied to perceived discrepancies: an agent perceiving favoritism directed toward another agent may become disenchanted.

7. Each mission incurs some danger or risk to the participants. We model the group as being as vulnerable as its least-skilled member. Denote the risk for the group performing task $j$ at time $t$ by $r^j(\min_{i \in I^j(t)} s_i^j(t))$, where $r^j$ is a nonincreasing function of its argument. This represents the fact that the more skilled the group, the less susceptible it is to hazard.

8. As a result of participating in a mission in area $j$ at time $t$, $X_i$'s skills increase in that area as a function of the success of the group and the participant's skill level at that time. If $X_i$ does not participate at time $t$, its skill atrophies. We write

$$s_i^j(t+1) = \begin{cases} f_1^j[g(\sigma^j(t)), s_i^j(t)] & i \in I^j(t) \\ f_2^j[s_i^j(t)] & i \notin I^j(t). \end{cases}$$

9.  The  individual  goal  of  each  agent  is  to  increase  its  skill  level  in  its  specified  area,  or  equiva¬ 
lently,  to  increase  its  satisfaction  in  its  specified  area. 

10.  The  group  goal  is  twofold.  First,  all  necessary  tasks  must  be  completed.  (That  is,  the  group 
identifies  with  administrative  requirements.)  Second,  as  a  group  all  participants  are  to  in¬ 
crease  their  skill  levels,  particularly  in  their  preferred  area. 

11.  Each  agent  has  some  input  on  whether  they  participate  on  a  mission.  This  is  weaker  than 
Wynn’s  model,  in  which  all  agents  must  agree  on  who  participates  on  each  mission.  (Need 
to  develop  this  issue  more  fully.) 

12. “The group goal is primarily for all participants to increase their skill levels uniformly, that is, for them to achieve some form of skill equilibrium. Although this goal is generally consistent with the individual goals, it is not necessarily served by having each of the agents pursue their individual goals separately. There must be some principle of coordination that will supersede the individual interests. Let

$$g[\sigma(t)] = g[s_{i_\ell}(t),\ \ell = 1, 2, \ldots, M]$$

denote the joint satisfaction for the group.” (Wynn)

In the present case, the group goal needs to be reconsidered. More correctly, it is not desirable to have all skill levels of all agents approach uniformity in all areas, since it is expected that better skill will be obtained by specialists in a skill area. From a management perspective, the goal is to have enough agents skilled in each area to be able to effectively meet mission requirements, and to attain a modicum of agent satisfaction. Let

$$g[g_i(t),\ i = 1, \ldots, N]$$

denote the group satisfaction. This will be affected by the individual satisfaction on the tasks which the agents are participating in (depending on their preferences).

Let us take the selectability measure of $X_0$ to be the degree to which the mandatory jobs are filled, and the rejectability to be a measure of the un-skill of the agents involved. Thus, $X_0$ wants the job done, and prefers that it be done only by skilled agents. Beyond this, management does not care explicitly about the comfort of the other agents. (However, there is a built-in linkage: since better skilled agents are preferred, this should ultimately favor the building of skills in the agents, and so works in accord with their selectability function.)


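We will form

$$s_0(u(t), t) = \sum_{j=1}^{3} \max(\bar{M}^j(t) - \hat{M}^j(u(t)),\ 0),$$

where $\bar{M}^j(t)$ is the number of mandatory jobs of type $j$ and $\hat{M}^j(u(t))$ is the number of agents that $u(t)$ assigns to task $j$. This measure counts as a penalty a failure to fill the mandatory jobs, but does not incur any penalty for having more than this filled. Take

$$\tilde{s}_0(u, t) = \max_{v \in \mathcal{U}(t)} s_0(v, t) - s_0(u, t)$$

as $X_0$'s unnormalized selectability, and form

$$p_{S_0}(u) = \begin{cases} \dfrac{\tilde{s}_0(u, t)}{\sum_{v \in \mathcal{U}} \tilde{s}_0(v, t)} & u \in \mathcal{U} \\[2ex] 0 & \text{otherwise.} \end{cases}$$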
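A small sketch of the manager's selectability follows. It assumes the penalty is the total shortfall against the mandatory job counts, that the unnormalized selectability is the worst penalty minus a choice's own penalty, and that normalization is over the admissible choices only. The names `shortfall` and `manager_selectability`, the example choices, and the uniform fallback for the all-tied case are assumptions that do not come from the text.

```python
from collections import Counter
from fractions import Fraction


def shortfall(u, mandatory):
    """Total number of mandatory jobs left unfilled by decision vector u."""
    counts = Counter(u)
    return sum(max(need - counts.get(j, 0), 0) for j, need in mandatory.items())


def manager_selectability(admissible, mandatory):
    """Normalize (worst penalty - own penalty) over the admissible choices."""
    penalty = {u: shortfall(u, mandatory) for u in admissible}
    worst = max(penalty.values())
    raw = {u: worst - p for u, p in penalty.items()}
    total = sum(raw.values())
    if total == 0:
        # Assumption (not in the text): all choices tie, so spread mass evenly.
        return {u: Fraction(1, len(raw)) for u in raw}
    return {u: Fraction(r, total) for u, r in raw.items()}


# Three hypothetical admissible choices for three agents; one flying job and
# one sailing job are mandatory, swabbing is optional.
choices = [(1, 2, 3), (1, 1, 0), (0, 0, 0)]
p_s0 = manager_selectability(choices, {1: 1, 2: 1, 3: 0})
print(p_s0)
```

Overfilling a task earns no extra credit: the choice that sends two agents flying is still penalized for the unfilled sailing job.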


Let the combined risk be denoted as

$$r_0(u) = \rho_1\, r^1(\min_{i \in I^1(t)} s_i^1(t)) + \rho_2\, r^2(\min_{i \in I^2(t)} s_i^2(t)) + \rho_3\, r^3(\min_{i \in I^3(t)} s_i^3(t)),$$

where $\rho_j$ are weighting factors for the various tasks, with $\sum_j \rho_j = 1$. For example, management may care less about the risk to swabbers than they do about the risk to pilots. This can be converted into a rejectability function (if desired) by normalization over the admissible choices:

$$p_{R_0}(u) = \frac{r_0(u)}{\sum_{v \in \mathcal{U}} r_0(v)}, \qquad u \in \mathcal{U},$$

with a separate value assigned otherwise.


The individual selectability for an agent $X_i$, $i = 1, 2, \ldots, N$, is determined by

$$s_i(u) = g_i(t),$$

where $g_i(t)$ is defined by (4.1). The individual rejectability for $X_i$ is

$$r_i(u) = r^j(s_i^j(t)), \qquad j = u_i,$$

where $r^0(x) = 0$ (no risk in doing nothing).

4.4.2 Resource and job descriptions

At time $t$, assume that there is a given number of resources available to accomplish the tasks (one resource for each task). It may occur that the number of requested tasks exceeds the number of resources (more tasks than there are resources).

In  an  eventual  description,  there  may  be  a  history  associated  with  each  resource,  and  some 
criterion  for  resource  use.  For  example,  it  may  be  desirable  to  ensure  that  each  plane  has  approxi¬ 
mately  the  same  amount  of  air  time.  However,  we  will  not  worry  about  such  a  complication  at  this 
point. 

The skill vector associated with task $j$ is

$$\sigma^j(t) = \{s_i^j(t) : i \in I^j(t)\}.$$

Individual  selectability:  based  on  wanting  to  fly.  Individual  rejectability:  tied  to  length  of  time 
negotiations  take. 

Joint selectability for $X_1, \ldots, X_N$: Each individual wants to fly, but also wants the group as a whole to succeed.

Joint selectability for $X_0$: see that the necessary tasks are filled. Keep the agents as happy as possible.

How to tie these together? Let $I = [0, 1, \ldots, N]$ and $\bar{I} = [1, \ldots, N]$ (excluding the first index, the manager's). Then

$$p_{S_I} = p_{S_0, S_{\bar{I}}} = p_{S_0|S_{\bar{I}}}\, p_{S_{\bar{I}}}.$$

$p_{S_0|S_{\bar{I}}}$ represents the manager's selectability given the agents' selectabilities. If he is recalcitrant or unresponsive, then his perceptions do not change with the selectabilities of the other agents, $p_{S_0|S_{\bar{I}}} = p_{S_0}$. This makes an interesting model.

Agent  joint:  Let  us  assume  an  enlightened  stance:  each  agent  is  willing  to  concede  to  any  less 
skilled  than  he.  Let  the  skill  levels  pj 


4.4.3  Negotiation 


(From Wynn.) With the joint selectability and rejectability functions, we form the joint satisficing set

$$S_b(t) = \{u \in \mathcal{U} : p_S(u, s(t)) \ge b\, p_R(u, s(t))\}.$$

Marginal selectability and rejectability:

$$p_{S_i}(u_i, s(t)) = \sum_{u_j \in U_j,\ j \ne i} p_{S_1,\ldots,S_N}(u_1, \ldots, u_N, s(t)),$$

with $p_{R_i}$ obtained from the joint rejectability in the same way.

Individually satisficing sets:

$$S_b^i(t) = \{u_i \in U_i : p_{S_i}(u_i; s(t)) \ge b\, p_{R_i}(u_i; s(t))\}.$$

Then the individually satisficing sets and the jointly satisficing sets can be reconciled using negotiation, as discussed above.

4.5 Satisficing negotiation for resource allocation with disputed resources

In  this  section,  we  present  a  resource  allocation  study  in  which  resources  are  disputed.  This  mate¬ 
rial  comes  from  [21]. 

Decision  making  agents  acting  together  should  be  influenced  not  only  by  their  own  aspira¬ 
tions  and  budgets  but  by  these  aspects  of  other  agents  in  the  system.  To  represent  this  interaction 
among  agents,  a  notion  of  group  rationality  must  be  embodied  in  the  decision  systems  of  inter¬ 
acting  agents.  Group  rationality  is  not  necessarily  a  logical  consequence  of  rationality  based  on 
individual  self  interest.  Under  a  model  of  rationality  in  which  maximization  of  utility  is  the  oper¬ 
ative  notion,  group  behavior  obtained  by  amalgamation  of  the  individual  behaviors  is  not  usually 
optimized  by  optimizing  each  individual  behavior,  as  is  typically  done  in  a  game-theoretic  setting. 
Those  who  put  their  final  confidence  in  the  limited  perspective  of  exclusive  self-interest  may  ul¬ 
timately  function  disjunctively,  and  perhaps  illogically,  when  participating  in  collective  activities. 
Rather  than  reorient  game  theory  to  accommodate  situations  where  coordination  is  a  more  natural 
operational  descriptor  of  the  game  than  is  self-interested  conflict,  we  propose  to  describe  notions 
that  are  neutral  with  respect  to  questions  of  conflict  and  coordination. 

Beyond  simply  taking  into  account  the  presence  of  other  acting  agents  in  a  system,  there  is 
frequently  some  form  of  sociality  that  is  conducive  to  at  least  a  weak  form  of  congruity  or  mutual 
agreement.  In  cooperative  scenarios,  agents  agree  to  work  together;  in  competitive  scenarios, 
agents  tacitly  agree  to  oppose  each  other.  The  procedures  used  to  arrive  at  these  agreements  are 
not  determined  simply  as  a  function  of  the  preference  structure  of  the  decision  makers,  whether 
posed  in  a  framework  of  self  interest  or  community  interest.  Agreement  among  agents  is  typically 
obtained  via  a  process  of  negotiation,  in  which  multiple  agents  evaluate  and  share  information 
when  they  have  incentive  to  strike  a  mutually  acceptable  compromise.  In  the  negotiating  process, 
it  is  not  sufficient  for  a  decision  maker  merely  to  identify  an  acceptable  joint  solution  (for  the 
community)  according  to  its  own  lights.  The  entire  community  should  “buy  into”  a  joint  solution 
that  is  mutually  acceptable. 

In  negotiation  it  is  rare  that  all  parties  involved  will  tip  their  hands  to  reveal  all  of  the  factors 
influencing  their  decisions.  In  a  competitive  setting,  a  policy  of  secrecy  may  keep  competitors 
from  exploiting  a  weakness,  or  it  may  be  used  to  persuade  competitors  to  a  more  advantageous 
position.  In  a  cooperative  setting,  complete  disclosure  of  information  might  be  precluded  due  to 
restrictions  in  communication  bandwidth  and/or  time.  Because  of  a  lack  of  disclosure,  negotiation 
may  invoke  principles  of  inference,  wherein  agents  attempt  to  estimate  positions  or  attitudes  of 
other  agents  based  on  the  options  they  bring  to  the  bargaining  table. 

In  light  of  these  observations,  some  principles  of  negotiation  are  suggested: 

N-1  Negotiators  must  typically  be  concerned  with  meeting  minimum  requirements  more  than 
achieving  maximum  performance. 

N-2  Negotiations  should  lead  to  decisions  that  are  both  good  enough  for  the  group  as  a  whole  (as 
established  by  a  group  rationality)  as  well  as  good  enough  for  each  individual  (as  established 
by local preferences).

N-3  Negotiation is typically an iterative process. Starting from a set of initial joint options, it is
natural  to  iterate  toward  solutions  which  are  individually  acceptable,  rather  than  attempting 
to  move  directly  to  joint  options  which  are  a  best  compromise. 

N-4  Negotiation  may  frequently  incorporate  elements  of  inference. 

A  rich  model  for  negotiation  should  be  able  to  capture  other  aspects  of  the  negotiation  process, 
such  as  recalcitrance  (resistance  to  accede  to  group  preferences),  accommodation  (openness  to 
group  preferences),  or  annoyance  over  extended  or  unchanging  negotiation  positions. 

In  this  paper,  we  will  briefly  review  the  concept  of  praxeic  utility  decision  theory  as  a  means 
of  implementing  satisficing  control,  then  extend  this  to  multiple  agent  decision  making  to  model 
group rationality. Concepts of negotiation consistent with the principles outlined above are established
using this multi-agent satisficing framework. As a case study of a problem for which a
negotiated solution is reasonable, a problem of resource allocation with disputed resources is modeled.

4.5.1  Satisficing decision making: single and multiple agents

Single  agent  satisficing 

Satisficing,  a  term  coined  by  Simon  [22],  refers  to  a  decision  making  strategy  in  which  options 
are  selected  which  are  “good  enough,”  differing  thereby  from  conventional  approaches  which  seek 
only  the  best.  From  the  satisficing  viewpoint,  being  “good  enough”  is  sufficient;  insisting  on  the 
best  and  only  the  best  via  an  optimizing  algorithm  may  be  an  overly  restrictive  luxury.  From  an 
operational  point  of  view,  however,  while  establishing  that  an  option  is  (at  least  locally)  optimal  is 
at least expressible as an optimization problem, establishing what is “good enough” appears to be
more  elusive.  The  question  of  establishing  good  enough  choices  is  addressed  from  a  philosophical 
point  of  view  with  regard  to  truth  systems  by  Levi  [23,  24,  25].  In  this  epistemological  framework, 
known  as  epistemic  utility,  options  are  sought  for  which  the  amount  of  information  associated  with 
them  exceeds  the  potential  for  error.  All  options  are  deemed  acceptable  —  good  enough  —  that 
pass  a  likelihood  ratio  test  comparing  a  truth  valuation  (a  probability)  and  an  informational  value  of 
rejection  (also  constructed  as  a  probability).  Application  of  epistemic  utility  to  control  problems 
yields  praxeic  utility  theory  [26,  27,  28,  29,  30,  31,  32,  33].  (For  a  discussion  of  praxeic  utility 
theory  in  the  context  of  negotiation,  and  for  a  more  complete  development  of  these  concepts,  see 
[34].) In this theory, a selectability function $p_S(u)$ is formed which, for each option $u$ in the
universe of options $U$ available to a decision making agent $X$, measures the degree to which
$u$ works toward success in achieving the agent’s goals. Also, a rejectability function $p_R(u)$ is established
which measures costs associated with each option. This pair of measures, called collectively
the satisfiability functions, is endowed with the mathematical structure of probabilities (e.g., they
are nonnegative and sum to 1 over $U$).

Definition 4.1 The satisficing set $\Sigma_q$ is the set of options defined by

$$\Sigma_q = \{u \in U : p_S(u) \geq q\, p_R(u)\}. \qquad (4.2)$$

□

The satisficing set consists of those options for which the benefit exceeds the cost: the set of
alternatives which are arguably “good enough.” There may be more than one option in $\Sigma_q$. Moving
away from strict adherence to optimality increases flexibility, since options other than the
best are retained. Ultimate selection of a single option for action is accomplished by means of a tie breaking
rule, such as most selectable, least rejectable, or maximally discriminating.

The parameter $q$ in (4.2) is the index of caution. As $q$ is increased, fewer options are accepted
into the satisficing set. As such, the agent exhibits greater caution, accepting only options of
higher merit in comparison to their cost. We say that $\Sigma_q$ is the satisficing set at level $q$. Because of
its similarity to likelihood ratio tests in conventional decision theory, the test in (4.2) is referred to
as the praxeic likelihood ratio test (PLRT).
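
The effect of the index of caution can be illustrated with a small sketch. The option names and the numerical values of the selectability and rejectability functions are hypothetical, chosen only so that both functions sum to 1; the comparison is written as `>=`, and the values avoid ties, so a strict comparison would give the same sets.

```python
from typing import Dict, Hashable, Set


def satisficing_set(p_s: Dict[Hashable, float],
                    p_r: Dict[Hashable, float],
                    q: float = 1.0) -> Set[Hashable]:
    """Return the options u that pass the test p_S(u) >= q * p_R(u)."""
    return {u for u in p_s if p_s[u] >= q * p_r[u]}


# Hypothetical three-option example (illustrative values only).
p_s = {"a": 0.50, "b": 0.35, "c": 0.15}   # selectability, sums to 1
p_r = {"a": 0.20, "b": 0.30, "c": 0.50}   # rejectability, sums to 1

cautious = satisficing_set(p_s, p_r, q=2.0)    # high caution: few options
default = satisficing_set(p_s, p_r, q=1.0)
liberal = satisficing_set(p_s, p_r, q=0.25)    # low caution: more options
```

Raising `q` shrinks the set and lowering it enlarges the set, which is the behavior the index of caution is meant to capture.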

Multiple  agent  satisficing 

Satisficing  decision  theory  extends  very  naturally  to  multiple  agent  systems.  Satisficing  admits 
degrees  of  fulfillment,  whereas  optimization  is  an  absolute  concept.  While  the  statement  “What 
is  best  for  me  and  what  is  best  for  you  is  also  jointly  best  for  us  together”  may  be  nonsense,  the 
statement  “What  is  good  enough  for  me  and  what  is  good  enough  for  you  is  also  jointly  good 
enough  for  us  together”  may  be  perfectly  sensible,  especially  when  we  do  not  have  inflexible 
notions  of  what  it  means  to  be  “good  enough.”  Satisficing  grants  room  for  compromise,  leaving 
open  the  opportunity  for  one  or  more  agents  involved  to  relax  standards  of  individual  performance 
in  the  interest  of  the  good  of  the  community.  A  theory  of  multi-agent  satisficing  thus  provides  the 
stage  on  which  the  act  of  negotiation  can  reasonably  be  presented. 

Since  they  possess  the  mathematical  structure  of  probabilities,  selectability  and  rejectability 
can  be  naturally  extended  to  the  multivariate  case  by  defining  joint  selectability  and  rejectability 
measures,  which  may  be  used  to  determine  a  jointly  satisficing  set.  In  addition,  individual  decision 
makers  may  establish  individual  notions  of  satisficing  by  computing  marginal  selectability  and 
rejectability  functions  from  the  joint  expressions. 

Let $X_1, \ldots, X_N$ be $N$ interacting agents, where each agent has its own decision space $U_i$.
The joint action space is the space $\mathbf{U} = U_1 \times U_2 \times \cdots \times U_N$. A joint decision is an element
$\mathbf{u} = (u_1, u_2, \ldots, u_N) \in \mathbf{U}$. We denote the $i$th element of $\mathbf{u}$ as $\mathbf{u}(i)$.

An  act  by  any  single  member  of  a  multi-agent  system  has  potential  ramifications  throughout 
the  entire  community.  And,  although  a  participant  may  perform  an  act  either  in  its  own  interest  or 
for  the  benefit  of  others,  the  act  is  usually  not  implemented  free  of  cost:  resources  are  expended  or 
risk  is  taken,  perhaps  by  the  single  agent,  but  also  perhaps  by  the  entire  community.  Although  these 
consequences may be defined independently from the benefits, the measures associated with benefits
and cost cannot necessarily be specified independently of each other. In light of this, the object
representing the relationships between agents in their systems regarding their individual and joint
selectability and rejectability is an interdependence measure which combines both rejectability and
selectability for all agents as a multivariate probability function of the form

$$p_{S_1 \cdots S_N R_1 \cdots R_N}(u_1, \ldots, u_N, v_1, \ldots, v_N).$$

This is expressed more briefly as $p_{S,R}(\mathbf{u}, \mathbf{v})$. Values for the interdependence measure are typically
obtained  by  means  of  factorization  into  constituent  conditional  and  marginal  probabilities.  In  these 
factorizations,  agents  may  represent  how  their  selectabilities  or  rejectabilities  are  affected  by  the 
selectability  and  rejectability  of  other  agents.  From  the  general  interdependence  function,  the  joint 
selectability  function  is  obtained  by  a  marginalization 

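$$p_S(\mathbf{u}) = \sum_{\mathbf{v} \in \mathbf{U}} p_{S,R}(\mathbf{u}, \mathbf{v})$$

and similarly

$$p_R(\mathbf{v}) = \sum_{\mathbf{u} \in \mathbf{U}} p_{S,R}(\mathbf{u}, \mathbf{v}).$$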


Definition 4.2 The multipartite satisficing decision rule defines the multipartite satisficing set
by

$$\boldsymbol{\Sigma}_q = \{\mathbf{u} \in \mathbf{U} : p_S(\mathbf{u}) \geq q\, p_R(\mathbf{u})\}. \qquad (4.3)$$

□

Joint options in $\boldsymbol{\Sigma}_q$ are those for which the benefits exceed the costs, as viewed from the
perspective  of  the  group  and  as  represented  by  the  joint  selectability  and  joint  rejectability.  The 
test  in  (4.3)  is  referred  to  as  the  joint  praxeic  likelihood  ratio  test  (JPLRT). 
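
As a rough illustration of the marginalization and the joint test, the sketch below stores an interdependence function as a dictionary keyed by pairs of joint options. The two-agent, two-option setting, the numerical values, and the construction of the interdependence function as a simple product are all assumptions made for the example; a real interdependence function would normally be built from conditional factors.

```python
from typing import Dict, Set, Tuple

Joint = Tuple[str, ...]   # one option per agent


def joint_marginals(p_sr: Dict[Tuple[Joint, Joint], float]):
    """Sum the interdependence function over v to get p_S, and over u to get p_R."""
    p_s: Dict[Joint, float] = {}
    p_r: Dict[Joint, float] = {}
    for (u, v), mass in p_sr.items():
        p_s[u] = p_s.get(u, 0.0) + mass
        p_r[v] = p_r.get(v, 0.0) + mass
    return p_s, p_r


def joint_satisficing_set(p_s: Dict[Joint, float],
                          p_r: Dict[Joint, float],
                          q: float) -> Set[Joint]:
    """Joint options passing the joint test p_S(u) >= q * p_R(u)."""
    return {u for u in p_s if p_s[u] >= q * p_r.get(u, 0.0)}


# Hypothetical joint selectability and rejectability for two agents.
sel = {("a", "a"): 0.4, ("a", "b"): 0.3, ("b", "a"): 0.2, ("b", "b"): 0.1}
rej = {("a", "a"): 0.1, ("a", "b"): 0.2, ("b", "a"): 0.3, ("b", "b"): 0.4}

# Assumed (trivial) interdependence: selectability independent of rejectability.
p_sr = {(u, v): sel[u] * rej[v] for u in sel for v in rej}

p_s, p_r = joint_marginals(p_sr)
sigma = joint_satisficing_set(p_s, p_r, q=1.0)
```

With these values the marginals recover `sel` and `rej` up to rounding, and only the joint options whose selectability is at least their rejectability survive the test.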

Given  joint  selectability  and  joint  rejectability,  an  individual  agent  can  compute  marginals  by 

$$p_{S_i}(u_i) = \sum_{\{\mathbf{u} \in \mathbf{U} :\, \mathbf{u}(i) = u_i\}} p_S(\mathbf{u})$$

$$p_{R_i}(v_i) = \sum_{\{\mathbf{v} \in \mathbf{U} :\, \mathbf{v}(i) = v_i\}} p_R(\mathbf{v}).$$

The resulting individually satisficing set for $X_i$ is then

$$\Sigma_q^i = \{u_i \in U_i : p_{S_i}(u_i) \geq q\, p_{R_i}(u_i)\}.$$

Alternatively, an agent could employ a $p_{S_i}$ and $p_{R_i}$ not obtained as marginals, presenting a different
face to the public than what it holds for itself, and use these functions to compute $\Sigma_q^i$.

We will use the notation $u = \mathbf{u}(i)$ to indicate that the option $u \in U_i$ is the $i$th element of a joint
option vector $\mathbf{u}$.

An option that is jointly satisficing for $X_i$ is not necessarily individually satisficing for $X_i$.
That is, a joint option $\mathbf{u} \in \boldsymbol{\Sigma}_q$ does not necessarily have $\mathbf{u}(i) \in \Sigma_q^i$. The converse, however, is true:
if $u \in \Sigma_q^i$, then $u = \mathbf{u}(i)$ for some $\mathbf{u} \in \boldsymbol{\Sigma}_q$. This is established by the following.

Theorem 1 (The negotiation theorem) If $u$ is individually satisficing for $X_i$, then it must be the $i$th
element of some jointly satisficing vector $\mathbf{u}$, i.e., $u = \mathbf{u}(i)$ for some $\mathbf{u} \in \boldsymbol{\Sigma}_q$.

Proof Without loss of generality, assume $i = 1$. Let $u \in \Sigma_q^1$ (i.e., $p_{S_1}(u) \geq q\, p_{R_1}(u)$). To
establish proof by contradiction, assume that $u \neq \mathbf{u}(1)$ for all $\mathbf{u} \in \boldsymbol{\Sigma}_q$. It follows that for all
$\mathbf{v} \in U_2 \times \cdots \times U_N$, $p_S(u, \mathbf{v}) < q\, p_R(u, \mathbf{v})$. Then

$$p_{S_1}(u) = \sum_{\mathbf{v}} p_S(u, \mathbf{v}) < q \sum_{\mathbf{v}} p_R(u, \mathbf{v}) = q\, p_{R_1}(u),$$

which contradicts $u \in \Sigma_q^1$. □

On the basis of the negotiation theorem, it may be argued that each agent has a seat at the negotiation
table. No one is necessarily frozen out of a deal.
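
The negotiation theorem can be checked numerically on a small case. The sketch below is only a sanity check, not a proof: it assumes two agents, hypothetical joint selectability and rejectability values, individual functions computed as marginals, and the same index of caution for the individual and joint tests (the conditions under which the theorem is stated).

```python
from typing import Dict, Tuple

Joint = Tuple[str, ...]


def marginal(p: Dict[Joint, float], i: int) -> Dict[str, float]:
    """Marginal of a joint mass function for agent i (sum over the other agents)."""
    out: Dict[str, float] = {}
    for u, mass in p.items():
        out[u[i]] = out.get(u[i], 0.0) + mass
    return out


def theorem_holds(p_s: Dict[Joint, float], p_r: Dict[Joint, float],
                  q: float, n_agents: int) -> bool:
    """True if every individually satisficing option appears in some jointly satisficing vector."""
    joint_set = {u for u in p_s if p_s[u] >= q * p_r[u]}
    for i in range(n_agents):
        ps_i, pr_i = marginal(p_s, i), marginal(p_r, i)
        individual = {a for a in ps_i if ps_i[a] >= q * pr_i[a]}
        covered = {u[i] for u in joint_set}   # i-th entries of jointly satisficing vectors
        if not individual <= covered:
            return False
    return True


# Hypothetical joint functions for two agents with options "a" and "b".
p_s = {("a", "a"): 0.4, ("a", "b"): 0.3, ("b", "a"): 0.2, ("b", "b"): 0.1}
p_r = {("a", "a"): 0.1, ("a", "b"): 0.2, ("b", "a"): 0.3, ("b", "b"): 0.4}

checks = [theorem_holds(p_s, p_r, q, n_agents=2) for q in (0.5, 1.0, 2.0)]
```

At `q = 2.0` the second agent has no individually satisficing option at all, so the theorem holds vacuously for it; this is the situation in which further compromise would be needed.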

It is important to emphasize what the negotiation theorem does not provide. If $u_1$ is individually
satisficing for $X_1$, and $u_2$ is individually satisficing for $X_2$, then by the theorem $u_1 = \mathbf{u}(1)$ and
$u_2 = \mathbf{u}'(2)$ for some $\mathbf{u}, \mathbf{u}' \in \boldsymbol{\Sigma}_q$. However, it is not necessarily the case that $\mathbf{u} = \mathbf{u}'$: the options
that  are  both  individually  and  jointly  satisficing  may  be  different  for  different  agents.  Thus,  the 
negotiation  theorem  does  not  establish  a  “solution”  to  the  problem.  However,  it  provides  the  basis 
upon  which  a  solution  may  be  sought  through  an  iterative  negotiation  scheme.  To  obtain  buy-in 
from  all  agents,  the  options  that  are  individually  satisficing  for  each  agent  must  be  elements  of  the 
same  jointly  satisficing  options.  Such  options  are  the  result  of  negotiation. 

The proof of the negotiation theorem makes use of the fact that the same index of caution is
used to compute $\Sigma_q^i$ as is used for $\boldsymbol{\Sigma}_q$, but an agent could use a higher $q$ to determine $\boldsymbol{\Sigma}_q$ than it
does for its own satisficing set. The negotiation theorem is not necessarily true in this case. Or it
may happen that each agent has its own perception of the interdependence function. In this case,
we denote $X_i$’s interdependence function as $p^i_{S,R}$ and the corresponding joint selectability and
rejectability as $p^i_S$ and $p^i_R$. Again, the negotiation theorem applies to each agent separately, but not
collectively. Another possibility is that an agent may use joint functions $p_S$ and $p_R$ for establishing
$\boldsymbol{\Sigma}_q$, but use individual $p_{S_i}$ and $p_{R_i}$ not computed as marginals of $p_S$ and $p_R$, thereby presenting
a different “public face” and “private face.” Again, the negotiation theorem does not apply.
(This raises a question for future investigation: to what degree can these private satisfiability
functions differ from marginal satisfiability functions and still allow reasonable negotiations?)

The  negotiation  theorem,  and  these  observations  about  its  provisional  application,  motivate  the 
development  of  algorithms  for  negotiation  based  on  satisficing. 

4.5.2  Satisficing Negotiation

Multi-agent  satisficing  is  suited  to  the  principles  of  negotiation  outlined  in  the  introduction.  By 
admitting  degrees  of  fulfillment,  satisficing  agents  can  explore  options  which  are  both  mutually 
and  individually  good  enough.  In  a  negotiation  process,  however,  it  is  not  sufficient  for  a  decision 
maker  to  identify  a  solution,  even  one  which  it  views  as  being  jointly  acceptable.  As  mentioned 
above,  other  agents  in  the  system  may  have  their  own  models  of  the  joint  interdependence  function, 
and  their  own  individual  satisficing  functions,  or  may  determine  individually  and  jointly  satisficing 
options  not  coincident  with  those  of  other  agents.  A  negotiated  solution  should  ideally  be  one  in 
which all members of the community can individually concur. The multipartite satisficing set $\boldsymbol{\Sigma}_q$
and the individually satisficing set $\Sigma_q^i$ provide each $X_i$ with a basis for negotiation: an assessment
from each agent’s point of view of all options that are good enough for the group, and an
assessment of all individual options that are good enough for itself.

Compromise  among  a  group  of  agents  involves  a  lowering  of  standards  by  admitting  possible 
actions  that  an  agent,  acting  only  unilaterally,  would  not  necessarily  prefer,  but  which  it  is  willing 
to  admit  in  the  interest  of  acting  as  part  of  a  group.  The  lowering  of  standards  motivates  the  use 
of  a  satisficing  outlook  in  the  decision  theory.  An  approach  based  on  optimization,  particularly 
one  formulated  on  the  basis  of  exclusive  self-interest,  does  not  admit  grades  or  degrees.  A  choice 
is  either  optimal,  or  it  is  not.  Compromise  does  not  necessarily  entail  complete  abolition  of  any 
agent’s standards. An agent feeling that too much compromise is imposed on it may walk away from
the  negotiating  table. 

In the formalism of multi-agent satisficing, an agent’s index of caution $q_i$ acts as a parameter
representing the degree of compromise an agent is willing to adopt. By lowering the degree of
caution, an agent is willing to consider placing more options in its satisficing set. As $q_i \to 0$, every
option available to $X_i$ is satisficing for $X_i$. If all agents are willing to sufficiently reduce their
standards, a jointly acceptable solution can be obtained.

We will let $\mathbf{q} = (q_1, \ldots, q_N)$ denote the caution vector of the players. The least cautious index
is $q_L = \min\{q_1, \ldots, q_N\}$. From an individual perspective, the negotiation theorem applies if an
agent uses its own index of caution, $q_i$, to determine the individually satisficing set, but uses $q_L$ to
determine its jointly satisficing set. It is assumed hereafter that each agent uses $q = q_L$ to determine
$\boldsymbol{\Sigma}_q$. This reflects the conservative observation that the standards of a group can be no higher than
the standards of any member of the group.

The  relationship  between  individually  and  jointly  satisficing  sets  for  an  agent  is  formalized  by 
the  following: 

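Definition 4.3 The set of all jointly satisficing vectors in $\boldsymbol{\Sigma}_{q_L}$ that are also individually satisficing
for $X_i$ is the compromise set $C_i$, defined by

$$C_i = \{\mathbf{u} = (u_1, \ldots, u_N) \in \boldsymbol{\Sigma}_{q_L} : u_i \in \Sigma^i_{q_i}\}.$$

□

Since $q_L \leq q_i$, the negotiation theorem indicates that $C_i$ is not empty (provided $\Sigma^i_{q_i}$ is itself not empty). By the negotiation theorem,
if $u \in \Sigma^i_{q_i}$ then there is some $\mathbf{u} \in C_i$ such that $u = \mathbf{u}(i)$.

We define

$$C_i(j) = \{u_j : u_j \in \mathbf{u} \text{ for some } \mathbf{u} \in C_i\}$$

as the set of all options for $X_j$ in $C_i$.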

Definition 4.4 The joint accord set $N_{\mathbf{q}}$ is the set of all vectors that are jointly (at caution level
$q_L$) and individually (at caution level $q_i$) satisficing for all agents. That is,

$$N_{\mathbf{q}} = \bigcap_{i=1}^{N} C_i.$$

Any joint option $\mathbf{u} \in N_{\mathbf{q}}$ is a joint accord option. □

The desired outcome of a negotiation algorithm is a nonempty joint accord set $N_{\mathbf{q}}$. From $N_{\mathbf{q}}$, a single
joint accord option may be selected according to some tie breaking rule, such as the rule
maximizing joint benefit to cost ratio,

$$\mathbf{u}^* = \arg\max_{\mathbf{u} \in N_{\mathbf{q}}} \frac{p_S(\mathbf{u})}{p_R(\mathbf{u})}. \qquad (4.4)$$


This  option  is  called  the  rational  compromise. 

If $N_{\mathbf{q}} = \emptyset$, then there are no decisions which are jointly and individually acceptable to all
agents  in  the  system. 
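
The set constructions above reduce to filtering and intersection, as the following sketch suggests. The jointly and individually satisficing sets are supplied directly as hypothetical inputs (they are not derived here), and the helper names are illustrative rather than taken from the text.

```python
from typing import Dict, List, Set, Tuple

Joint = Tuple[str, ...]


def compromise_set(joint_set: Set[Joint], individual_set: Set[str], i: int) -> Set[Joint]:
    """C_i: jointly satisficing vectors whose i-th entry is individually satisficing."""
    return {u for u in joint_set if u[i] in individual_set}


def joint_accord_set(compromise_sets: List[Set[Joint]]) -> Set[Joint]:
    """Intersection of the compromise sets of all agents."""
    accord = set(compromise_sets[0])
    for c in compromise_sets[1:]:
        accord &= c
    return accord


def rational_compromise(accord: Set[Joint], p_s: Dict[Joint, float],
                        p_r: Dict[Joint, float]) -> Joint:
    """Accord option with the largest ratio p_S / p_R; the caller must ensure accord is nonempty."""
    def ratio(u: Joint) -> float:
        return p_s[u] / p_r[u] if p_r[u] > 0 else float("inf")
    return max(accord, key=ratio)


# Hypothetical inputs for two agents.
p_s = {("a", "a"): 0.4, ("a", "b"): 0.3, ("b", "a"): 0.2, ("b", "b"): 0.1}
p_r = {("a", "a"): 0.1, ("a", "b"): 0.2, ("b", "a"): 0.3, ("b", "b"): 0.4}
joint_set = {("a", "a"), ("a", "b"), ("b", "a")}   # assumed jointly satisficing set
individual_sets = [{"a"}, {"a", "b"}]              # assumed individually satisficing sets

c = [compromise_set(joint_set, individual_sets[i], i) for i in range(2)]
accord = joint_accord_set(c)
best = rational_compromise(accord, p_s, p_r)
```

Here the first agent rules out any vector in which it plays "b", so the accord set is the first agent's compromise set, and the ratio rule selects among the two remaining vectors.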

Definition  4.5  In  the  context  of  multi-agent  satisficing  theory,  negotiation  is  the  process  of  working 
toward  a  solution  which  is  individually  and  jointly  acceptable  to  each  agent  in  the  system.  □ 

Working toward a negotiated solution requires at least one of the agents to lower its standards,
after which the agents recompute their compromise sets. If no agent is willing to compromise further (by lowering its
own standards), then an impasse is reached in the negotiation process. Until that point is reached,
however, negotiations may proceed in good faith. The lowering of standards is represented in
this context by a lowering of an agent’s index of caution. Algorithm 4.1 outlines a negotiation
algorithm  based  on  this  observation,  termed  the  Enlightened  Liberals  algorithm  (“liberal”  in  the 
sense  of  being  tolerant  of  views  other  than  one’s  own;  “enlightened”  in  the  sharing  of  information). 


Algorithm 4.1 The Enlightened Liberals Negotiation Algorithm

Step 1: Initialize: $q_i = {}_0q_i$ for $i = 1, \ldots, N$. $q_L = \min_i({}_0q_i)$.

Step 2: $X_i$ forms $\boldsymbol{\Sigma}_{q_L}$ and $\Sigma^i_{q_i}$.

Step 3: $X_i$ forms $C_i = \{\mathbf{u} \in \boldsymbol{\Sigma}_{q_L} : u_i \in \Sigma^i_{q_i}\}$.

Step 4: Communicate $C_i$ and $q_i$ to other agents.

Step 5: Each agent forms $N_{\mathbf{q}} = \bigcap_{j=1}^{N} C_j$.

Step 6: If $N_{\mathbf{q}} = \emptyset$, each agent determines how much to lower $q_i$, then communicates $q_i$ with the other
agents. Then repeat from step 2.

Step 7: If $N_{\mathbf{q}} \neq \emptyset$, form the rational compromise $\mathbf{u}^* = (u_1, u_2, \ldots, u_N)$ according to (4.4).

In  this  algorithm,  all  agents  communicate  their  choices  in  the  same  step.  There  is  no  way  to  use  the 
partial  information  provided  by  another  agent’s  compromise  set  to  modify  an  agent’s  decisions. 
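
A minimal sketch of the Enlightened Liberals loop follows. Several simplifications are assumptions of the sketch and not part of the algorithm as stated: all agents share one joint selectability and rejectability, individual functions are marginals, every agent concedes by multiplying its index of caution by a fixed factor `step`, and an impasse is declared after `max_rounds` rounds.

```python
from typing import Dict, List, Optional, Tuple

Joint = Tuple[str, ...]


def marginal(p: Dict[Joint, float], i: int) -> Dict[str, float]:
    """Marginal of a joint mass function for agent i."""
    out: Dict[str, float] = {}
    for u, mass in p.items():
        out[u[i]] = out.get(u[i], 0.0) + mass
    return out


def enlightened_liberals(p_s: Dict[Joint, float], p_r: Dict[Joint, float],
                         q0: List[float], step: float = 0.9,
                         max_rounds: int = 100) -> Optional[Tuple[Joint, List[float]]]:
    q = list(q0)                                                    # Step 1
    for _ in range(max_rounds):
        q_l = min(q)
        joint = {u for u in p_s if p_s[u] >= q_l * p_r[u]}          # Step 2 (joint set)
        accord = set(joint)
        for i in range(len(q)):
            ps_i, pr_i = marginal(p_s, i), marginal(p_r, i)
            individual = {a for a in ps_i if ps_i[a] >= q[i] * pr_i[a]}  # Step 2 (individual set)
            c_i = {u for u in joint if u[i] in individual}          # Step 3
            accord &= c_i                                           # Steps 4-5
        if accord:                                                  # Step 7
            best = max(accord,
                       key=lambda u: p_s[u] / p_r[u] if p_r[u] > 0 else float("inf"))
            return best, q
        q = [step * qi for qi in q]                                 # Step 6 (assumed concession rule)
    return None                                                     # impasse


# Hypothetical two-agent problem with deliberately high initial caution.
p_s = {("a", "a"): 0.4, ("a", "b"): 0.3, ("b", "a"): 0.2, ("b", "b"): 0.1}
p_r = {("a", "a"): 0.1, ("a", "b"): 0.2, ("b", "a"): 0.3, ("b", "b"): 0.4}

result = enlightened_liberals(p_s, p_r, q0=[2.0, 2.0])
stalled = enlightened_liberals(p_s, p_r, q0=[2.0, 2.0], max_rounds=2)
```

In this example the second agent initially finds no option individually satisficing, so the accord set is empty until both agents have conceded three times; limiting the number of rounds to two produces an impasse instead.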

Inference  in  negotiation 

It  may  be  noted  that  Enlightened  Liberals  is  in  accord  with  the  first  three  principles  of  negotiation 
outlined  in  the  introduction.  However,  no  inference  is  employed  in  the  algorithm  as  stated,  since 
all agents essentially pass information simultaneously. The inference problem faced by agent
$X_i$ is to estimate $p^j_{S,R}$, $p_{S_j}$, and $p_{R_j}$ (the praxeic system employed by $X_j$) given the
offered solutions brought to the negotiating table in the form of $C_j$. As an estimation problem,
all the tools of statistical estimation theory can be brought to bear, such as Bayesian estimates,
maximum likelihood, minimum variance, maximum entropy, etc. (The method selected is problem
dependent.) In the example presented below, a heuristic is illustrated which is similar to maximum
likelihood.

Incorporation of the inference aspect of the negotiation is outlined in Algorithm 4.2, which
differs from Enlightened Liberals mostly in the sequential nature of the exchange of information.


Algorithm 4.2 The Inferring Liberals Negotiation Algorithm

Step 1: Initialize: $q_i = {}_0q_i$ for $i = 1, \ldots, N$. $q_L = \min_i({}_0q_i)$.

Step 2: $X_i$ infers updates for $p^i_S$, $p^i_R$, $p_{S_i}$, $p_{R_i}$ based on the compromise sets $\{C_j,\ j = 1, 2, \ldots, N,\ j \neq i\}$.

Step 3: $X_i$ forms $\boldsymbol{\Sigma}_{q_L}$ and $\Sigma^i_{q_i}$.

Step 4: $X_i$ forms $C_i$.

Step 5: $X_i$ communicates $C_i$ and $q_i$ to all other agents.

Step 6: After all agents have transmitted their information, each agent forms $N_{\mathbf{q}} = \bigcap_{j=1}^{N} C_j$.

Step 7: If $N_{\mathbf{q}} = \emptyset$, each agent determines how much to lower $q_i$, then communicates $q_i$ with the other
agents. Then repeat from step 2.

Step 8: If $N_{\mathbf{q}} \neq \emptyset$, form the rational compromise $\mathbf{u}^* = (u_1, u_2, \ldots, u_N)$ according to (4.4).


4.5.3  Example:  Disputed  Resource  Allocation 

Complexity is no argument against a theoretical approach if the complexity arises not out of
the theory but out of the material which any theory ought to handle.
-- Frank Palmer, Grammar (1971)

We  illustrate  the  negotiation  framework  outlined  above  —  including  interdependence  factor¬ 
ization,  establishing  satisfiability  functions  and  inference  —  by  means  of  a  resource  allocation 
problem.  Consider  a  situation  in  which  N  agents  are  to  allocate  M  resources  among  themselves  in 
such  a  way  that  all  resources  are  allocated  to  at  least  one  agent,  but  more  than  one  agent  may  claim 
a  resource.  A  resource  with  more  than  one  claimant  is  a  disputed  resource.  Disputed  resources 
are  of  lower  value  than  undisputed  resources,  both  because  utilization  of  a  resource  is  attenuated 
by  virtue  of  sharing,  and  because  of  an  intrinsic  societal  valuation  that  would  avoid  dispute.  The 
goal  of  each  agent  is  to  obtain  as  much  of  the  resources  as  possible  (or  the  resource  allocation  with 
the maximum valuation), while working toward having as few disputed resources as possible.
Starting from some initial allocation of resources, each agent must also sustain a cost of acquisition
for each additional resource that is required. While expressed as an abstract “resource allocation”
problem, it may be helpful to envision the geographical division of a country among non-aligned
factions. The apportionment of the land of Israel among Israeli and Palestinian claimants is a recent
motivating example. Interestingly, data that might be adapted for a problem on a larger scale
for  Europe  in  the  mid-twentieth  century  have  been  the  subject  of  research  in  studies  in  cooperation 
and  complexity  (see,  e.g.,  [35]). 

We will consider specifically only the two agent case; extension to more agents is straightforward
in principle. We will denote the allocation decision vector of agent $X_i$ by a vector
$u^i \in \{0, 1\}^M$ where $u^i_j = 1$ if resource $j$ is selected by agent $i$. (A final option denoted as
$u^i = 0$ might also be used to indicate that $X_i$ is terminating the negotiation process and walking
away from the negotiating table.) The decision vector $\bar{u}^i$ is used to indicate the Boolean complement
of the decision vector $u^i$. For decision vectors $u^1$ and $u^2$, we will denote by $d = u^1 \cap u^2$ the
disputed resources claimed by both agents.^1 We denote by $u^i \setminus d$ the resources claimed exclusively
by $X_i$. As agents begin the process, they have some initial allocation ${}_0u^1$ and ${}_0u^2$ with disputed
resources

$${}_0d = {}_0u^1 \cap {}_0u^2.$$
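
In code, the allocation vectors and the disputed set correspond to element-wise Boolean operations. The sketch below uses plain lists of 0/1 values with hypothetical initial allocations over four resources; the function names are illustrative.

```python
from typing import List


def disputed(u1: List[int], u2: List[int]) -> List[int]:
    """Element-wise AND: resources claimed by both agents."""
    return [a & b for a, b in zip(u1, u2)]


def exclusive(u: List[int], d: List[int]) -> List[int]:
    """Resources claimed by one agent and not disputed."""
    return [a & (1 - b) for a, b in zip(u, d)]


def complement(u: List[int]) -> List[int]:
    """Boolean complement of a decision vector."""
    return [1 - a for a in u]


# Hypothetical initial allocations over M = 4 resources.
u1 = [1, 1, 0, 1]
u2 = [0, 1, 1, 1]

d0 = disputed(u1, u2)
only1 = exclusive(u1, d0)
only2 = exclusive(u2, d0)
all_claimed = all(a | b for a, b in zip(u1, u2))   # every resource has at least one claimant
```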

Formulation  of  selectability  and  rejectability 

The joint interdependence function is $p_{S,R}(\mathbf{u}, \mathbf{v}) = p_{S_1,S_2,R_1,R_2}(u_1, u_2, v_1, v_2)$. When we want to
represent explicitly that this is the interdependence function as perceived by agent $X_i$, this will be
denoted as $p^i_{S_1,S_2,R_1,R_2}(u_1, u_2, v_1, v_2)$. If this is changing with negotiation iteration number $p$, we
will indicate this with ${}_{p}\,p^i_{S_1,S_2,R_1,R_2}(\mathbf{u}; \mathbf{v})$.

To  formulate  specific  results,  it  is  expedient  to  factor  the  joint  interdependence  function  into 
conditional  and  marginal  probability  measures.  Conditional  probabilities,  as  observed  by  Pearl 
[36]  permit  local  or  specific  responses  to  be  characterized.  Conditional  behavior  is  behavior  at 
the  local  level,  with  all  dependencies  specified.  Such  factorizations  permit  characterization  of 
global  behavior  in  terms  of  local  relationships,  which  are  frequently  easier  to  specify.  A  variety 
of  factorizations  are  possible,  even  for  the  simple  case  of  two  agents,  and  it  is  not  necessary  for 
all  agents  to  invoke  the  same  factorizations.  However,  in  this  example,  both  agents  will  factor  the 
joint  interdependence  function  the  same  way. 

We will express the factorization from the point of view of $X_1$, then express the inference process
from the point of view of $X_2$ (using information from $X_1$). As a shorthand we will represent
the factorization of the probability functions in terms only of their variables, using, for example,
$S_2|R_2$ as a representation for $p_{S_2|R_2}(u_2|v_2)$. A reasonable (but not unique) factorization, expressed
from the point of view of $X_1$, is

$$(S_1, S_2, R_1, R_2) = (S_1|S_2, R_1, R_2)(R_1|S_2, R_2)(S_2|R_2)(R_2).$$

^1 The notation $u^1 \cap u^2$ might be more exactly represented as $u^1 \wedge u^2$, where a “bitwise” AND operation is implied
in each element. However, the $\cap$ notation seems to be more suggestive.

Under the assumption that rejectability and selectability are independent for a given agent, we
obtain

$$(S_1, S_2, R_1, R_2) = (S_1|S_2, R_2)(R_1|S_2, R_2)(S_2)(R_2). \qquad (4.5)$$

There is an attractive symmetry in the first two factors, being the selectability and rejectability
(respectively), conditioned on those quantities for the other agent. The first two terms of this
factorization represent $X_1$’s selectability and rejectability, respectively, when $X_2$ places all of its
selectability mass and rejectability mass on the conditioning arguments. The factorization in (4.5)
is expressed more explicitly as

$$p^1_{S_1|S_2,R_2}(u_1|u_2, v_2)\, p^1_{R_1|S_2,R_2}(v_1|u_2, v_2)\, p^1_{S_2}(u_2)\, p^1_{R_2}(v_2). \qquad (4.6)$$

In the sections below, we describe the parameters that affect each of the terms in this factorization.

Goals 

Each agent wants to maximize the value of the resources it claims. There is a functional $g^i(u)$
adopted by $X_i$ evaluating its intended allocation. This might be quite simple, as in^2

$$g^i(u) = \sum_{j \in u} e^i(j),$$

where $e^i(j)$ measures the intrinsic value of resource $j$. This may include the size of the resource
as  well  as  other  attributes.  (In  the  case  of  land  as  a  resource,  it  might  measure  attributes  such  as  a 
harbor  or  an  airport,  mineral  or  agricultural  assets,  or  historical  or  religious  value.)  In  the  case  that 
the  resources  are  distributed  in  space,  the  value  may  be  determined  using  less  localized  measures. 
For  example,  there  may  be  more  value  in  having  the  resources  as  near  to  each  other  as  possible,  or 
in  a  contiguous  block.  Or  there  may  be  less  value  to  a  resource  which  is  surrounded  by  resources 
claimed  by  the  other  agent.  (In  the  case  of  land  apportionment,  an  agent  might  prefer  large  pieces 
contiguously  joined,  with  no  islands  of  other  agents’  land  in  the  middle.)  All  of  these  variations 
can be incorporated into $g^i(u)$.

Given  that  the  agents’  goals  are  prescribed  by  the  desire  to  obtain  more  resources,  we  simply 
normalize  the  allocation  value  to  form  a  probability  mass  function: 

$$p_{S_1|S_2,R_2}(u_1|u_2, v_2) = p_{S_1}(u_1).$$

That is, the selectability is conditionally independent of $X_2$’s options. In the joint selectability
each agent thus acts independently:

$$p_{S_1,S_2}(u_1, u_2) = p_{S_1}(u_1)\, p_{S_2}(u_2).$$

In (4.6), the term $p^1_{S_2}(u_2)$ is $X_1$’s model (or perception) of $X_2$’s selectability. This is determined
simply by $X_1$’s estimate of $g^2(u)$ (estimated according to $X_1$’s knowledge of $X_2$).
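
One way to read the normalization step is sketched below: the goal functional is evaluated on every allocation in $\{0,1\}^M$ and divided by its total. The resource values are hypothetical, the additive goal functional is the simple form given above, and enumerating all allocations is only practical for small $M$; the sketch also assumes at least one resource has positive value so that the total is nonzero.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

Allocation = Tuple[int, ...]


def goal(u: Allocation, e: Sequence[float]) -> float:
    """Additive goal functional: total intrinsic value of the claimed resources."""
    return sum(e[j] for j, bit in enumerate(u) if bit)


def selectability(e: Sequence[float]) -> Dict[Allocation, float]:
    """Normalize the goal functional over all allocations to obtain a mass function."""
    options = list(product((0, 1), repeat=len(e)))
    total = sum(goal(u, e) for u in options)      # assumed to be positive
    return {u: goal(u, e) / total for u in options}


# Hypothetical intrinsic values of M = 3 resources for agent 1.
e1 = [3.0, 1.0, 2.0]
p_s1 = selectability(e1)
```

Under this simple model, claiming everything is the most selectable option and claiming nothing has zero selectability; the costs discussed next are what temper the larger claims.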

^2 The set $\{j \in u\}$ appearing in this summation is a shorthand for $\{j : u_j = 1\}$.

Costs

Several elements of the problem contribute to an agent’s perception of the cost of the choice.

Reduce  disputed  resource 

Each  agent  evidences  the  difficulty  of  sharing  the  resource  by  seeking  to  eliminate  the  disputed 
resources. This not only serves its own purposes (since disputed resources may not be enjoyed at full
value) but also makes a concession to the society of the agents, which would prefer undisputed
allocations.

In general, there is a cost function associated with disputed resources, which for agent $X_i$ is
denoted as $\delta^i(u^1 \cap u^2)$. This could depend on a variety of societal or historical factors. In some
disputed  resources,  there  may  be  no  cost  associated  with  more  than  one  claim  on  the  resources, 
whereas  for  others  there  is  considerable  cost. 

A simple model for the cost is to make the disputation cost function proportional to the
value of the disputed resource to each agent,

$$\delta^i(u^1 \cap u^2) = \delta^i(d) \propto \sum_{j \in d} \left(e^1_j + e^2_j\right).$$

This cost can be placed in the context of the conditional probability $p^1_{R_1|S_2,R_2}(v_1|u_2, v_2)$ as follows.

•  When $u_2 \neq v_2$, then $X_2$ places all of its selectability and none of its rejectability on the
vector $u_2$, so it is fully committed to the option $u_2$. Then it may be presumed that there will
be a disputation of $d = v_1 \cap u_2$, and the cost becomes $\delta^1(v_1 \cap u_2)$.

•  When $u_2 = v_2$, then $X_2$ places all of its selectability as well as all of its rejectability on
$u_2$, and hence is conflicted. In this conflicted state, $X_1$ assumes half the cost of disputed
territory, $\tfrac{1}{2}\delta^1(v_1 \cap u_2)$. (Other options are, of course, possible to deal with this conflicted
state.)

•  For those territories that $X_2$ has indicated that it doesn’t want (no selectability and high
rejectability), there is no cost for a disputed territory.

An overall rejectability function based on disputation can be formulated by normalization. We will
call this rejectability function $r^1_\delta(v_1|u_2, v_2)$.
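
The first two cases above can be sketched as follows. This is an illustration under stated assumptions rather than the formulation used in the example: the proportionality constant is taken to be 1, the resource values are hypothetical, the third (per-territory) case is not modeled, and a uniform rejectability is returned when no option incurs any disputation cost.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

Allocation = Tuple[int, ...]


def dispute_cost(v1: Allocation, u2: Allocation, v2: Allocation,
                 e1: Sequence[float], e2: Sequence[float]) -> float:
    """Cost to X1 of the disputed set d = v1 AND u2, halved when X2 is conflicted (u2 == v2)."""
    d = [a & b for a, b in zip(v1, u2)]
    full = sum(e1[j] + e2[j] for j, bit in enumerate(d) if bit)
    return 0.5 * full if tuple(u2) == tuple(v2) else full


def dispute_rejectability(u2: Allocation, v2: Allocation,
                          e1: Sequence[float],
                          e2: Sequence[float]) -> Dict[Allocation, float]:
    """Normalize the disputation cost over all of X1's allocations v1."""
    options = list(product((0, 1), repeat=len(e1)))
    costs = {v1: dispute_cost(v1, u2, v2, e1, e2) for v1 in options}
    total = sum(costs.values())
    if total == 0:                      # assumed fallback: no dispute anywhere
        return {v1: 1.0 / len(options) for v1 in options}
    return {v1: c / total for v1, c in costs.items()}


# Hypothetical values of M = 2 resources to each agent.
e1 = [3.0, 1.0]
e2 = [2.0, 2.0]

r = dispute_rejectability(u2=(1, 0), v2=(0, 1), e1=e1, e2=e2)
committed = dispute_cost((1, 1), (1, 0), (0, 1), e1, e2)    # X2 committed to u2
conflicted = dispute_cost((1, 1), (1, 0), (1, 0), e1, e2)   # X2 conflicted
```

Note that halving the cost changes the unnormalized value but not the normalized function for a fixed conditioning pair, so in this sketch the conflicted case matters only when the disputation cost is combined with other cost terms before normalization.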

Cost  of  Acquisition 

There  is  a  cost  associated  with  acquiring  the  resources  beyond  the  initial  allocation.  (For  example, 
in  the  case  of  land  resources,  simply  making  the  decision  to  acquire  the  land  does  not  make  it  so.  It 
may  be  necessary  to  deploy  troops  to  enforce  the  decision,  or  to  move  in  colonists,  etc.)  The  cost 
of  acquisition  will  also  depend  on  the  interest  that  the  competing  agent  has  in  the  new  territory. 
In the context of the conditional probability $p_{R_1|S_2,R_2}(v_1 \mid u_2, v_2)$, the following observations
can be made.

•  In the case that $u_2 \neq v_2$ (that is, $X_2$ puts all of its selectability and none of its rejectability
on the vector $u_2$), then it may be presumed that there will be a dispute over $d = v_1 \cap u_2$. The
cost of the disputed acquisition is denoted by

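$$\chi^1(d;\, {}_0u^1, {}_0u^2),$$

while the cost of the undisputed acquisition is

$$\chi^1(v_1 \setminus d;\, {}_0u^1).$$

The total cost is then the sum of these:

$$\chi^1(v_1;\, {}_0u^1, {}_0u^2) = \chi^1(v_1 \setminus d;\, {}_0u^1) + \chi^1(d;\, {}_0u^1, {}_0u^2).$$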


•  When $u_2 = v_2$, that is, $X_2$ is conflicted, placing all of its selectability as well as all its
rejectability on $u_2$, then $X_1$ might assume that $X_2$ will not be in disputation; the total cost
then reduces to the undisputed acquisition cost alone.

•  For those options on which $X_2$ places none of its selectability and all of its rejectability,
we will assume the same result as when $X_2$ is conflicted. (More generally, we could have a
reduced cost for acquisition.)

•  In the more general case, $X_2$ may be conflicted in some areas but not in others. In this case,
$X_1$ only counts as disputed those territories which intersect with its interests and for which
$X_2$ is unconflicted.

Combining these costs together and suitably normalizing, the rejectability function
$p_{R_1|S_2,R_2;\chi}(v_1 \mid u_2, v_2)$ is obtained.
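
As an illustration of how the acquisition cost might be assembled, the sketch below uses the
per-region costs for $X_1$ listed in table 4.2. It rests on assumptions that the text does not state:
allocations are bit masks, only regions outside $X_1$'s initial allocation incur a cost, and a newly
acquired region is charged the disputed rate when it lies in $d = v_1 \cap u_2$ and $X_2$ is unconflicted.
The function and variable names are hypothetical.

```python
# Per-region acquisition costs for X1, keyed by region bit mask (table 4.2).
DISPUTED = {0b0001: 6, 0b0010: 4, 0b0100: 4, 0b1000: 8}
UNDISPUTED = {0b0001: 3, 0b0010: 2, 0b0100: 2, 0b1000: 4}


def acquisition_cost(v1: int, u0_1: int, u2: int, v2: int) -> int:
    """Cost to X1 of acquiring option v1 starting from initial allocation u0_1."""
    new = v1 & ~u0_1                   # regions X1 does not already hold (assumption)
    d = v1 & u2 if u2 != v2 else 0     # disputed only when X2 is unconflicted
    cost = 0
    for region in DISPUTED:
        if new & region:
            # Disputed rate inside d, undisputed rate elsewhere.
            cost += DISPUTED[region] if d & region else UNDISPUTED[region]
    return cost
```

For example, with the initial allocation [1110] for $X_1$, acquiring region 0001 while $X_2$ is committed
to [1101] is charged at the disputed rate, whereas the same acquisition is charged at the undisputed
rate when $X_2$ is conflicted.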


Cost  of  negotiation 

An agent may attribute cost to the process of negotiation. If the negotiation must proceed through
several iterations, an agent may become sufficiently annoyed at the process that its response is to
walk away from the negotiating table. Several factors may be incorporated into the cost of
negotiation, including the number of iterations (which we denote by $\eta$), or the apparent lack of
progress (if the compromise sets coming from other agents appear to be unchanging). A cost based on
the number of iterations can also represent the determination of an agent with respect to certain
options: while the overall boldness is decreasing, the rejectability of some options can be
correspondingly increased to partially offset the reduction. The cost of negotiation is represented by
$\sigma(v_1, u_2, v_2; \eta)$; suitably normalized, it becomes the rejectability function
$p_{R_1|S_2,R_2;\sigma}(v_1 \mid u_2, v_2; \eta)$.

Overall conditional rejectability

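The conditional rejectability function in (4.6) is expressed as a convex sum of the rejectability
functions described above:

$$p_{R_1|S_2,R_2}(v_1 \mid u_2, v_2) = \lambda_1\, p_{R_1|S_2,R_2;\delta}(v_1 \mid u_2, v_2) + \lambda_2\, p_{R_1|S_2,R_2;\chi}(v_1 \mid u_2, v_2) + \lambda_3\, p_{R_1|S_2,R_2;\sigma}(v_1 \mid u_2, v_2; \eta) \qquad (4.7)$$

where $\sum_i \lambda_i = 1$.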


The marginal $p_{R_2}(v_2)$ and joint rejectability

The quantity $p_{R_2}(v_2)$ in (4.6) is $X_1$'s model of $X_2$'s marginal (unconditional) rejectability. This
is viewed (in this formulation) as a separate parameter, not a derived quantity. A variety of factors
influence the joint rejectability. Even if the factors could be computed exactly, the weighting
factors in the combination may be unknown. The difficulty of determining this quantity reliably in
advance suggests the need to estimate it, if possible, during the negotiating process. Inference of
$p_{R_2}(v_2)$ is discussed below.

As the negotiating process begins, some initial condition is needed. One initial condition
reflecting this uncertainty is to assume that $p_{R_2}(v_2)$ apportions equal rejectability to all options.
Another approach is to allow $p_{R_2}(v_2)$, as an unconditional measure, to reflect those aspects of
the problem that are most independent of actions or goals of other agents. In this light, allowing
$p_{R_2}(v_2)$ to be proportional to the cost of acquisition is reasonable,

$$p_{R_2}(v_2) \propto \chi^2(v_2;\, {}_0u^2) + \delta^2(v_2, {}_0u^1).$$

It is straightforward to verify that the joint rejectability can be computed as

$$p_{R_1 R_2}(v_1, v_2) = \sum_{u_2 \in U_2} p_{R_1|S_2,R_2}(v_1 \mid u_2, v_2)\, p_{S_2}(u_2)\, p_{R_2}(v_2).$$
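
The convex sum (4.7) and the marginalization over $u_2$ can be illustrated with a short sketch. It
assumes that each rejectability component is stored as a dictionary from options to probabilities;
the function names and all numbers are made up for illustration.

```python
from typing import Dict, Sequence, Tuple

Dist = Dict[int, float]


def convex_sum(components: Sequence[Dist], weights: Sequence[float]) -> Dist:
    """Combine rejectability components with nonnegative weights summing to 1."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")
    return {v1: sum(w * c[v1] for w, c in zip(weights, components))
            for v1 in components[0]}


def joint_rejectability(cond: Dict[Tuple[int, int], Dist],
                        p_s2: Dist, p_r2: Dist) -> Dict[Tuple[int, int], float]:
    """p(v1, v2) = sum over u2 of p(v1 | u2, v2) * p_S2(u2) * p_R2(v2)."""
    options_1 = list(next(iter(cond.values())).keys())
    return {(v1, v2): p_r2[v2] * sum(cond[(u2, v2)][v1] * p_s2[u2] for u2 in p_s2)
            for v1 in options_1 for v2 in p_r2}


# Toy example with two options per agent (all numbers are made up).
delta = {0: 0.2, 1: 0.8}
chi = {0: 0.5, 1: 0.5}
sigma = {0: 0.6, 1: 0.4}
mix = convex_sum([delta, chi, sigma], [0.5, 0.3, 0.2])
cond = {(u2, v2): mix for u2 in (0, 1) for v2 in (0, 1)}
joint = joint_rejectability(cond, p_s2={0: 0.7, 1: 0.3}, p_r2={0: 0.4, 1: 0.6})
```

Because each conditional component is a probability mass function and the weights are convex, the
resulting joint rejectability sums to one over all pairs $(v_1, v_2)$.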


Inference 

We consider now the question of inference of the parameters of other agents during the course of
negotiation, presenting a method which is reasonable in the context of the present problem. After
its initial decision-making step, $X_1$ presents $C_1$ and $q_1$ to $X_2$. Based on this compromise set and
caution index, what can be inferred about $X_1$'s satisfiability functions? Because $X_2$ will be doing
its computations based on the factorization (4.6), it may be assumed that $X_2$ has a model of
$p_{S_1}(u)$, since this is based primarily on economic questions which are observable by all agents. As
mentioned above, however, $p_{R_1}(u)$ is difficult to obtain without further information. This
parameter influences the joint rejectability $p_{R_1 R_2}$ and hence the marginal $p_{R_2}$, so its estimation
has an extended influence in the decision making process. (This section, for the sake of definiteness, is
presented as if $X_2$ were making inference based on information from $X_1$.)

Given $p_{S_1}(u)$ and $p_{R_1}(u)$, consider the joint options in $C_1$. If $p_{S_1}(u) \ge q_1 p_{R_1}(u)$ and
$u = \mathbf{u}(1)$ for some $\mathbf{u} \in C_1$, then the compromise set provides no information: it reflects
decisions that would be made using $X_2$'s estimates. Also, if $p_{S_1}(u) < q_1 p_{R_1}(u)$ and $u \notin C_1(1)$
then no additional information is provided: $X_2$ did not expect the choice, and $X_1$ did not select it.

However, if $p_{S_1}(u) \ge q_1 p_{R_1}(u)$ and $u \notin C_1(1)$ then $X_1$ has rejected an option, both individually
and jointly, which according to $X_2$'s model it should have accepted. Furthermore, if
$p_{S_1}(u) < q_1 p_{R_1}(u)$ but $u \in C_1(1)$, then $X_1$ has accepted an option, both individually and jointly,
which according to $X_2$'s model it should have rejected. Both of these circumstances indicate that
$X_2$'s model $p_{R_1}(u)$ is inaccurate at $u$ and needs updating. Our inference rule is to change the
rejectability $p_{R_1}(u)$ in such a way that these inconsistencies are resolved, and in such a way that
the change at each point is minimized while ensuring that the probability constraint is satisfied.

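Define the sets

$$\tilde{U} = \left\{ u \in U_1 : \begin{array}{l} p_{S_1}(u) < q_1 p_{R_1}(u) \text{ and } u \notin C_1(1), \text{ or} \\ p_{S_1}(u) \ge q_1 p_{R_1}(u) \text{ and } u \in C_1(1) \end{array} \right\}$$

$$\overline{U} = \{ u \in U_1 : p_{S_1}(u) < q_1 p_{R_1}(u) \text{ and } u \in C_1(1) \}$$

$$\underline{U} = \{ u \in U_1 : p_{S_1}(u) \ge q_1 p_{R_1}(u) \text{ and } u \notin C_1(1) \}$$

Elements in $\tilde{U}$ have rejectability consistent with $C_1$. Elements in $\overline{U}$ have rejectability too high
to be consistent with $C_1$, while elements in $\underline{U}$ have rejectability too low. For $u \in \overline{U}$ or
$u \in \underline{U}$, form updated rejectability functions by

$$p_{R_1,\mathrm{new}}(u) = \alpha(u)\, p_{R_1}(u),$$

where

$$\alpha(u) = \begin{cases} \dfrac{p_{S_1}(u)}{q_1 p_{R_1}(u)} - \epsilon, & u \in \overline{U}, \\[1ex] \dfrac{p_{S_1}(u)}{q_1 p_{R_1}(u)} + \epsilon, & u \in \underline{U}, \end{cases}$$

for some small positive $\epsilon$. This introduces a net change in rejectability

$$\Delta = \sum_{u \in \overline{U} \cup \underline{U}} p_{R_1}(u)\,(1 - \alpha(u)),$$

which is to be distributed among the rejectabilities of the elements in $U_1$ with the smallest change
possible without affecting the decision boundaries. Let

$$U_> = \{ u \in \tilde{U} : p_{S_1}(u) \ge q_1 p_{R_1}(u) \}$$

and let

$$U_< = \{ u \in \tilde{U} : p_{S_1}(u) < q_1 p_{R_1}(u) \}.$$

The notation $|U|$ denotes the number of elements in the set $U$. Then the redistribution is as follows:

If $\Delta > 0$ (i.e., rejectability is added):

•  Distribute $\Delta$ among $\tilde{U} \cup \underline{U}$ as equally as possible, such that in $U_< \cup \underline{U}$,
$p_{R_1,\mathrm{new}}(u) \le 1$, and in $U_>$, $p_{S_1}(u) \ge q_1 p_{R_1,\mathrm{new}}(u)$.

If $\Delta < 0$ (i.e., rejectability is removed):

•  Distribute $\Delta$ among $\tilde{U} \cup \overline{U}$ as equally as possible, such that in $U_> \cup \overline{U}$,
$p_{R_1,\mathrm{new}}(u) \ge 0$, and in $U_<$, $p_{S_1}(u) < q_1 p_{R_1,\mathrm{new}}(u)$.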

This simple-minded inference does not fully exploit the information available. For example,
if $X_1$ appears, judging by $C_1$, to be uninterested in some resource, $X_2$ could parameterize an increase
in its interest in that area either by lowering its rejectability with respect to its acquisition, or by
increasing its selectability. However, the inference described above suffices to demonstrate the concept.
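
The first stage of this inference rule, the rescaling of the rejectability at options where the model
contradicts the observed compromise set, can be sketched as follows. This is an illustrative sketch
only: the redistribution of the net change over the remaining options is not included, the function
name and the sample numbers are hypothetical, and the multiplicative update is assumed to require a
strictly positive modeled rejectability.

```python
from typing import Dict, Set, Tuple


def rescale_inconsistent(p_s: Dict[int, float], p_r: Dict[int, float],
                         q: float, accepted: Set[int],
                         eps: float = 1e-3) -> Tuple[Dict[int, float], float]:
    """Rescale the modeled rejectability where it contradicts the observed set.

    p_s, p_r : X2's model of X1's selectability and rejectability
    q        : X1's announced index of caution (assumed positive)
    accepted : options of X1 that appear in its compromise set
    Returns the partially updated rejectability and the net change that
    still has to be redistributed over the remaining options.
    """
    new = dict(p_r)
    delta = 0.0
    for u in p_r:
        predicted = p_s[u] >= q * p_r[u]       # model says X1 accepts u
        if predicted == (u in accepted):
            continue                            # model agrees with observation
        if p_r[u] <= 0.0:
            raise ValueError("multiplicative update needs p_r[u] > 0")
        ratio = p_s[u] / (q * p_r[u])
        # Accepted but predicted rejected: shrink to just below the boundary.
        # Rejected but predicted accepted: grow to just above the boundary.
        alpha = ratio - eps if u in accepted else ratio + eps
        alpha = max(alpha, 0.0)
        new[u] = alpha * p_r[u]
        delta += p_r[u] * (1.0 - alpha)
    return new, delta


# Made-up model: option 1 was predicted accepted, yet only option 2 was accepted.
p_s = {1: 0.5, 2: 0.3, 3: 0.2}
p_r = {1: 0.2, 2: 0.5, 3: 0.3}
new, delta = rescale_inconsistent(p_s, p_r, q=1.0, accepted={2})
```

In this example the net change is negative, so rejectability would subsequently have to be removed
from other options to restore a total mass of one.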

4.5.4  Numerical  demonstration 

Consider a country with four regions as shown in figure 4.3. Values are apportioned in such a
way that adjacent regions have a value greater than the sum of the constituent areas, and that value
increases with more regions, as shown in table 4.1. The cost of disputed regions is also shown
in table 4.1 (unnormalized). Cost of acquisition is on a region-by-region basis, as shown in table
4.2. The initial allocation is ${}_0u^1 = [1110]$ (countries 2, 3, and 4) and ${}_0u^2 = [1101]$. Figure
4.4 illustrates the sequence of estimated probabilities $p_{R_1}(u)$ and $p_{R_2}(u)$ for six iterations. (The
abscissa represents the choice $u$ as a decimal representation of the binary decision vector. The
lines spaced within an integer interval $[u, u+1)$ represent the probability estimates for different
iterations of the negotiation algorithm.) After several iterations of negotiation, the compromise sets
shown in table 4.3 are obtained, and the joint accord set $N$ in table 4.4 is obtained, where the rational
compromise is indicated with *. (The integers represent the decimal form of the corresponding
binary option vectors.) The individually satisficing sets are {14} for $X_1$ and {5, 9, 12, 13} for $X_2$.
The final boldness reached is $q = (1.3, 1.3)$, after starting at $q_0 = (1.9, 1.9)$ and decrementing each
time by $\Delta q = 0.05$. It is interesting to note that even after negotiation, for the given value/cost
data, the two agents end up with disputed regions, and that the initial conditions still remain in
the joint accord set. However, having gone through the negotiating process, while regions must be
shared, the agents may feel that they have “bought in” to this circumstance, since these options are
individually satisficing.
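
The decimal encoding of option vectors used in the tables and figures can be reproduced with a few
lines of code. The helper names below are hypothetical; the sketch only assumes that the leftmost
binary digit is the most significant bit.

```python
def to_decimal(bits: str) -> int:
    """Decimal index of a binary option vector such as '1110'."""
    return int(bits, 2)


def to_bits(option: int, n: int = 4) -> str:
    """Binary option vector (n regions) for a decimal index."""
    return format(option, f"0{n}b")


u0_1 = to_decimal("1110")      # initial allocation of X1
u0_2 = to_decimal("1101")      # initial allocation of X2
disputed = u0_1 & u0_2         # regions claimed by both agents
print(u0_1, u0_2, to_bits(disputed))   # prints: 14 13 1100
```

The same bitwise intersection applies to joint options: for the pair (14, 9), for instance,
`to_bits(14 & 9)` gives `1000`, so one region remains claimed by both agents.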

4.5.5  Discussion 

Within this framework for negotiation there are several observations that may be made. In some
human negotiations, parties often restate a position repeatedly, without an apparent change of
state, until at some point there is an abrupt change in feasible options. The procedure represented
here  provides  a  model  for  such  behavior:  even  when  from  one  iteration  to  the  next  there  might  be 
no  change  in  compromise  sets,  each  agent  is  modifying  its  models  of  the  other  agent,  lowering  its 
caution,  and  potentially  changing  its  rejectability  as  a  function  of  the  number  of  iterations. 
Figure 4.3: Four resources to be distributed

 u    u (binary)   $p_{S_1}(u)$   $p_{S_2}(u)$   disputed cost ($X_1$)   disputed cost ($X_2$)
 1    0001         0.0210         0.0221                                 3
 2    0010         0.0252         0.0182         3
 3    0011         0.0546         0.0500         7                       7
 4    0100         0.0420         0.0318         6                       5
 5    0101         0.0714         0.0636         8                       9
 6    0110         0.0756         0.0500         8                       9
 7    0111         0.1092         0.0818         15                      12
 8    1000         0.0252         0.0455         2                       6
 9    1001         0.0504         0.0727         4                       9
 10   1010         0.0546         0.0636         4                       9
 11   1011         0.0588         0.0727         9                       5
 12   1100         0.0714         0.0818         9                       12
 13   1101         0.1008         0.1045         13                      14
 14   1110         0.1050         0.0955         13                      14
 15   1111         0.1345         0.1455         20                      20

Table 4.1: Valuations for resource allocation


 u      $\chi^1$ (disp.)   $\chi^1$ (undisp.)   $\chi^2$ (disp.)   $\chi^2$ (undisp.)
 0001   6                  3                    6                  3
 0010   4                  2                    4                  2
 0100   4                  2                    6                  3
 1000   8                  4                    10                 5

Table 4.2: Cost of acquisition per country

Figure 4.4: Sequence of estimated probabilities

Table 4.3: Compromise sets after negotiation


N 

(iW 

(14,9) 

(14,12)

(14,13)


Table  4.4:  Joint  accord  set 


Other  behaviors  such  as  recalcitrance  or  openness  can  be  modeled  depending  on  how  the  bold¬ 
ness  is  changed. 

A concern that may be raised regarding this procedure is its computational complexity. A
large measure of the complexity arises due to the computation of marginals in the formulation of
$p_{R_1,R_2}$. This complexity can be mitigated somewhat by efficient organization of the computations,
using, for example, the factor graph approach described in [37]. Another approach is to absorb the
normalization used in determining $p_{S_1,S_2}$ and $p_{R_1,R_2}$ into the index of caution $q$. Once an initial
$q$ can be determined which provides for meaningful individual and joint solutions, the index of
caution is adjusted until a group accord is established.

In  conclusion,  the  multi-agent  satisficing  theory  provides  a  means  of  describing  solutions  which 
are  individually  and  jointly  satisficing  from  the  perspective  of  an  individual  agent  in  the  commu¬ 
nity  of  agents.  We  have  provided  a  definition  of  negotiation,  which  is  the  process  of  working  to 
achieve accord among the different agents with regard to the solutions they find acceptable,
and  provided  some  algorithms  to  implement  that  process.  To  demonstrate  how  the  theory  may  be 
applied  to  a  multi-agent  problem,  a  resource  allocation  problem  was  presented  in  which  agents  vie 
for  disputed  resources. 


4.6  A  Praxeology  for  Rational  Negotiation

This  section  is  drawn  from  [34]. 

4.6.1  Introduction 

Negotiation is a branch of multi-agent decision making that involves the opportunity for repeated
interaction  between  independent  entities  as  they  attempt  to  reach  a  joint  decision  that  is  accept¬ 
able  to  all  participants.  But  unless  the  interests  of  the  decision  makers  are  extremely  compatible, 
achieving  such  a  compromise  will  usually  require  them  to  be  willing  to  consider  lowering  their 
standards  of  what  is  acceptable  if  they  are  to  avert  an  impasse.  For  an  agent  to  consider  lowering 
its  standards,  it  must  be  willing  to  relax  the  demand  for  the  best  possible  outcome  for  itself,  and 
instead  be  willing  to  settle  for  an  outcome  that  is  merely  good  enough,  in  deference  to  the  interests 
of  others.  Defining  what  it  means  to  be  good  enough,  however,  is  much  more  subtle  than  defining 
what  it  means  to  be  optimal,  and  any  such  definition  must  be  firmly  couched  in  and  consistent  with 
the  decision  maker’s  concept  of  rationality. 
Rational Choice

Fundamental rationality requires a decision maker to choose between alternatives in a way that
is consistent with its preferences. Consequently, before a rational decision is possible, a decision
maker must have some way to order its preferences.

Definition 4.6 Let the symbols $\succeq$ and $\cong$ denote binary relationships meaning “is at least as
good as” and “is equivalent to,” respectively, between members of a set $X = \{x, y, z, \ldots\}$. The set
$X$ is totally ordered if relationships between elements of $X$ are reflexive ($x \succeq x$), antisymmetric
($x \succeq y$ and $y \succeq x \Rightarrow x \cong y$), transitive ($x \succeq y$ and $y \succeq z \Rightarrow x \succeq z$), and linear (either $x \succeq y$
or $y \succeq x$ for all $x, y \in X$). If the linearity condition is relaxed, then the set is partially ordered. □
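
For a finite set, the four properties in Definition 4.6 can be checked by enumeration. The sketch
below is illustrative only; it takes the equivalence relation to be plain equality, which is an
assumption, and the function name is hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Iterable, TypeVar

T = TypeVar("T")


def order_properties(elements: Iterable[T],
                     at_least: Callable[[T, T], bool]) -> Dict[str, bool]:
    """Check the four properties of Definition 4.6 on a finite set.

    Equivalence is taken to be plain equality (an assumption of this sketch).
    """
    xs = list(elements)
    pairs = list(product(xs, repeat=2))
    triples = list(product(xs, repeat=3))
    return {
        "reflexive": all(at_least(x, x) for x in xs),
        "antisymmetric": all(x == y for x, y in pairs
                             if at_least(x, y) and at_least(y, x)),
        "transitive": all(at_least(x, z) for x, y, z in triples
                          if at_least(x, y) and at_least(y, z)),
        "linear": all(at_least(x, y) or at_least(y, x) for x, y in pairs),
    }


# Integers under >= are totally ordered.
numbers = order_properties([1, 2, 3], lambda a, b: a >= b)
# Subsets under the superset relation are only partially ordered.
subsets = order_properties(
    [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})],
    lambda a, b: a >= b,
)
```

The second example fails only the linearity check, because neither of the sets {1} and {2} contains
the other; this is the situation the definition calls a partial ordering.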

Once  in  possession  of  a  preference  ordering,  a  rational  decision  maker  must  employ  general 
principles  that  govern  the  way  the  orderings  are  to  be  used  to  formulate  decision  rules.  Perhaps  the 
most  well-known  principle  is  the  classical  economics  hypothesis  of  [38]  and  [39],  which  asserts 
that  individual  interests  are  fundamental;  i.e.,  that  social  welfare  is  an  aggregation  of  individual 
welfares.  This  hypothesis  leads  to  the  doctrine  of  rational  choice,  the  favorite  paradigm  of  con¬ 
ventional  decision  theory.  Rational  choice  is  based  upon  two  premises. 

P-1  Total  ordering:  a  decision  maker  is  in  possession  of  a  total  preference  ordering  for  all  of  its 
possible  choices  under  all  conditions  (in  multi-agent  settings,  this  includes  knowledge  of  the 
total  orderings  of  all  other  participants). 

P-2  The  principle  of  individual  rationality:  a  decision  maker  should  make  the  best  possible  deci¬ 
sion  for  itself,  that  is,  it  should  optimize  with  respect  to  its  own  total  ordering  (in  multi-agent 
settings,  this  ordering  will  be  influenced  by  the  preferences  of  others). 

A  praxeology,  or  science  of  efficient  action,  is  the  philosophical  underpinning  that  governs  the 
actions  of  a  decision-making  entity.  Conventional  praxeologies  are  founded  on  the  paradigm  of 
rational  choice.  For  single-agent  systems,  this  equates  to  optimization,  which  typically  results  in 
maximizing  expected  utility.  For  multi-agent  systems,  rational  choice  equates  to  equilibration:  a 
joint  decision  is  an  equilibrium  if,  were  any  individual  to  change  its  decision  unilaterally,  it  would 
decrease its own expected utility. Rational choice has a strong normative appeal; it tells us what
exclusively  self-interested  decision  makers  should  do,  and  is  the  praxeological  basis  for  much  of 
current  artificial  decision  system  synthesis  methodology.  The  ratiocination  for  this  approach,  as 
expressed  by  Sandholm,  is  that  each  decision  maker  should 

maximize  its  own  good  without  concern  for  the  global  good.  Such  self-interest  natu¬ 
rally  prevails  in  negotiations  among  independent  businesses  or  individuals  . . .  Therefore, 
the  protocols  must  be  designed  using  a  noncooperative,  strategic  perspective:  the  main 
question  is  what  social  outcomes  follow  given  a  protocol  which  guarantees  that  each 
agent’s desired local strategy is best for that agent — and thus the agent will use it [40,
pp. 201, 202].

This rationale is consistent with the conventional game-theoretic notion that society should not be
viewed as a generalized agent, or superplayer, who is capable of making choices on the basis of
some sort of group-level welfare function. So doing, [41] argues, creates an “anthropomorphic
trap” of failing to distinguish between group choices and group preferences.

Anthropomorphisms aside, it is far from obvious that exclusive self-interest is the appropriate
characterization  of  agent  systems  when  coordinated  behavior  is  desirable.  Granted,  it  is  possible 
under  the  individual  rationality  regime  for  a  decision  maker  to  suppress  its  own  egoistic  preferences 
in  deference  to  others  by  redefining  its  utilities,  but  doing  so  is  little  more  than  a  device  to  trick 
individual  rationality  into  providing  a  response  that  can  be  interpreted  as  unselfish.  Such  an  artifice 
provides  only  an  indirect  way  to  simulate  socially  useful  attributes  of  cooperation,  unselfishness, 
and  altruism  under  a  regime  that  is  more  naturally  attuned  to  competition,  exploitation,  and  avarice. 

Luce  and  Raiffa  summarized  the  situation  succinctly  when  they  observed  that 

general  game  theory  seems  to  be  in  part  a  sociological  theory  which  does  not  include 
any  sociological  assumptions  ...  it  may  be  too  much  to  ask  that  any  sociology  be  de¬ 
rived  from  the  single  assumption  of  individual  rationality  [42,  p.  196]. 

Often,  the  most  articulate  advocates  of  a  theory  are  also  its  most  insightful  critics.  Perhaps  the 
essence  of  this  criticism  is  that  rational  choice  does  not  provide  for  the  ecological  balance  that  a 
society  must  achieve  if  it  is  to  accommodate  the  variety  of  relationships  that  may  exist  between 
agents  and  their  environment.  But  achieving  such  a  balance  should  not  require  fabrication  of  a 
superplayer  to  aggregate  individual  welfare  into  group  welfare.  What  it  may  require,  however,  is 
reconsideration  of  the  claim  that  rational  choice  provides  the  appropriate  praxeology  for  synthe¬ 
sizing  cooperative  social  systems. 

State  of  the  Art 

There  are  many  proposals  for  artificial  negotiatory  systems  under  the  rational  choice  paradigm, 
bounded  in  various  ways  to  account  for  limitations  in  knowledge,  computational  ability,  and  ne¬ 
gotiation  time.  [43]  and  [44]  propose  models  of  alternating  offers;  these  approaches  are  refined 
by  [45]  to  account  for  time  constraints,  and  are  further  developed  by  [46,  47,  48],  [49],  and  [50] 
to  incorporate  a  time  discount  rate  and  to  account  for  incomplete  information  via  the  introduc¬ 
tion  of  a  revelation  mechanism.  These  approaches  are  based  on  a  notion  of  perfect  equilibrium, 
which  is  stronger  than  Nash  equilibrium  in  that  it  requires  that  an  equilibrium  must  be  induced  at  any 
stage  of  the  negotiation  process.  Similar  manifestations  of  bounded  rationality  occur  with  [51], 
who  present  a  general  framework  for  metareasoning  via  decision  theory  to  define  the  utility  of 
computation. Others have followed these same lines (see, for example, [52], [53], and [54]),
yielding optimal solutions according to performance criteria that are modified to account for resource
limitations.  Additional  approaches  to  bounded  rationality  occur  with  [55],  who  provide  a  rational 
analysis  framework  that  accounts  for  environmental  constraints  regarding  what  is  optimal  behavior 
in  a  particular  context.  Another  individual  rationality-based  approach  is  to  involve  market  price 
mechanisms,  as  is  done  by  [56,  57],  resulting  in  a  competition  between  agents  in  a  context  of  in¬ 
formation  service  provision.  [58]  use  the  Clarke  Tax  voting  procedure  to  obtain  the  highest  sum 
of utilities in an environment of truthful voting. [59] present a method of “principled negotiation”
involving proposed changes to an original master plan as a means of finding a distributed optimal
negotiated solution.

Another  stream  of  research  for  the  design  of  negotiatory  systems  is  to  rely  more  heavily  upon 
heuristics  than  upon  formal  optimization  procedures.  The  approach  taken  by  Rosenschein  and 
Zlotkin  is  to  emphasize  special  compromise  protocols  involving  pre-computed  solutions  to  specific 
problems  [60,  61,  62,  63].  Formal  models  which  describe  the  mental  states  of  agents  based  upon 
representations  of  their  beliefs,  desires,  intentions,  and  goals  can  be  used  for  communicative  agents 
[64,  65,  66,  67,  68,  69,  70].  In  particular,  Sycara  develops  a  negotiation  model  that  accounts  for 
human  cognitive  characteristics,  and  models  negotiation  as  an  iterative  process  involving  case- 
based  learning  and  multi-attribute  utilities  [71,  72].  [73]  provide  logical  argumentation  models  as 
an  iterative  process  involving  exchanges  among  agents  to  persuade  each  other  and  bring  about  a 
change  of  intentions.  [74,  75]  develop  a  negotiation  framework  that  employs  a  Bayesian  belief 
update  learning  process  through  which  the  agents  update  their  beliefs  about  their  opponent.  [76] 
advance  a  notion  of  partial  global  planning  for  distributed  problem  solving  in  an  environment  of 
uncertainty  regarding  knowledge  and  abilities. 

The  above  approaches  offer  realistic  ways  to  deal  with  the  exigencies  under  which  decisions 
must  be  made  in  the  real  world.  They  represent  important  advances  in  the  theory  of  decision  mak¬ 
ing,  and  their  importance  will  increase  as  the  scope  of  negotiatory  decision  making  grows.  They 
all appear, however, to have a common theme, which is that if a decision maker could maximize
its  own  private  utility  subject  to  the  constraints  imposed  by  other  agents,  it  should  do  so.  Exclusive 
self-interest  is  a  very  simple  concept.  It  is  also  a  very  limiting  concept,  since  it  justifies  ignoring 
the  preferences  of  others  when  ordering  one’s  own  preferences.  The  advantage  of  invoking  exclu¬ 
sive  self-interest  is  that  it  may  drastically  reduce  the  complexity  of  a  model  of  the  society.  The 
price  for  doing  so  is  the  risk  of  compromising  group  interests  when  individual  preferences  domi¬ 
nate,  or  of  distorting  the  real  motives  of  the  individuals  when  group  interests  dominate.  The  root 
of  the  problem,  in  both  of  these  extreme  cases,  is  the  lack  of  a  way  to  account  for  both  group  and 
individual  interests  in  a  seamless,  consistent  way. 


Middle  Ground 

Rather  than  searching  for  or  approximating  a  narrowly  defined  theoretical  ideal,  an  alternative 
is  to  focus  on  an  approach  that,  even  though  it  may  not  aspire  to  such  an  ideal,  is  ecologically 
tuned  to  the  environment  in  which  the  agents  must  function.  If  it  is  to  function  in  a  coordinative 
environment,  it  should  not  ignore  the  possibility  of  distinct  group  interests,  yet  it  must  respect 
individual  interests.  It  should  be  flexible  with  respect  to  evaluations  of  what  is  acceptable,  yet  it 
must  not  abandon  all  qualitative  measures  of  performance.  Kreps  seems  to  be  seeking  such  an 
alternative  when  he  observes  that 

. . . the real accomplishment will come in finding an interesting middle ground between
hyperrational behaviour and too much dependence on ad hoc notions of similarity and strategic
expectations. When and if such a middle ground is found, then we may have useful theories
for dealing with situations in which the rules are somewhat ambiguous [77, p. 184].

Is there really some middle ground, or is the lacuna between strict rational choice and pure
heuristics  bridgeable  only  by  forming  hybrids  of  these  extremes?  If  non-illusory  middle  ground 
does  exist,  few  have  staked  claims  to  that  turf.  Literature  involving  rational  choice  (bounded  or  un¬ 
bounded)  is  overwhelmingly  vast,  reflecting  many  decades  of  serious  study.  Likewise,  heuristics, 
rule-based  decision  systems,  and  various  ad  hoc  techniques  are  well-represented  in  the  literature. 
Rationality  paradigms  that  depart  from  these  extremes  or  blends  thereof,  however,  are  not  in  sub¬ 
stantial  evidence.  One  who  has  made  this  attempt,  however,  is  Slote  [78],  who  argues  that  it  is 
not  even  necessary  to  define  a  notion  of  optimality  in  order  to  define  a  common  sense  notion  of 
adequacy.  He  suggests  that  it  is  rational  to  choose  something  that  is  merely  adequate  rather  than 
something  that  is  best,  and  that  moderation  in  the  short  run  may  actually  be  instrumentally  optimal 
in  the  long  run.  Unfortunately,  Slote  does  not  metrize  the  notion  of  being  adequate.  It  is  far  easier 
to quantify the notion of bestness than it is to quantify the notion of adequacy. Striving for the
best  may  be  the  most  obvious  way  to  use  ordering  information,  but  it  is  not  the  only  way.  This  pa¬ 
per  presents  a  notion  of  adequacy  that  is  not  an  approximation  to  bestness — it  is  a  distinct  concept 
that  admits  a  precise  mathematical  definition  in  terms  of  utility-like  quantities.  The  motivation 
for  pursuing  this  development  is  to  soften  the  strict  egoism  of  individual  rationality  and  open  the 
way  for  consideration  of  a  more  socially  compatible  view  of  rationality  that  does  not  rely  upon 
optimization,  heuristics,  or  hybrids  of  these  extremes. 


4.6.2  A  New  Praxeology 

The  assumption  that  a  decision-maker  possesses  a  total  preference  ordering  that  accounts  for  all 
possible  combinations  of  choices  for  all  agents  under  all  conditions  is  a  very  strong  condition, 
particularly  when  the  number  of  possible  outcomes  is  large.  In  multi-agent  decision  scenarios, 
individuals  may  not  be  able  to  comprehend,  or  even  to  care  about,  a  full  understanding  of  their  en¬ 
vironment.  They  may  be  concerned  mostly  about  issues  that  are  closest  to  them,  either  temporally, 
spatially,  or  functionally.  A  praxeology  relevant  to  this  situation  must  be  able  to  accommodate 
preference  orderings  that  may  be  limited  to  proper  subsets  of  the  community  or  to  proper  subsets 
of  conditions  that  may  obtain. 

In  societies  that  value  cooperation,  it  is  unlikely  that  the  preferences  of  a  given  individual  will 
be  formed  independently  of  the  preferences  of  others.  Knowledge  about  one  agent’s  preferences 
may  alter  another  agent’s  preferences.  Such  preferences  are  conditioned  on  the  preferences  of  oth¬ 
ers.  Individual  rationality  does  not  accommodate  such  conditioning.  The  only  type  of  conditioning 
supported  by  individual  rationality  is  for  each  agent  to  express  its  preferences  conditioned  on  the 
choices  of  the  others  but  not  on  their  preferences  about  their  choices.  Each  agent  then  computes  its 
own  expected  utility  as  a  function  of  the  possible  options  of  all  agents,  juxtaposes  these  expected 
utilities  into  a  payoff  array,  and  searches  for  an  equilibrium.  Although  the  equilibrium  itself  is 
governed  by  the  utilities  of  all  agents,  the  individual  expected  utilities  that  define  the  equilibrium 
do  not  consider  the  preferences  of  others.  A  praxeology  for  a  complex  society,  however,  should 
accommodate  notions  of  cooperation,  unselfishness,  and  even  altruism.  One  way  to  do  this  is  to 
permit  the  preferences  (not  just  the  choices)  of  decision  makers  to  influence  each  other. 




Tradeoffs 


At present, there does not appear to be a body of theory that supports the systematic synthesis
of multi-agent decision systems that does not rely upon the individual rationality premise. It is
a platitude that decision makers should make the best choices possible, but we cannot rationally
choose an option, even if we do not know of anything better, unless we know that it is good
enough. Being good enough is the fundamental obligation of rational decision makers; being best
is a bonus.

Perhaps the earliest notion of being “good enough” is Simon’s concept of satisficing. His
approach is to blend rational choice with heuristics by specifying aspiration levels of how good a
solution might reasonably be achieved, and halting search for the optimum when the aspirations
are met [22, 79, 80]. But it is difficult to establish good and practically attainable aspiration
levels without first exploring the limits of what is possible, that is, without first identifying optimal
solutions, the very procedure this notion of satisficing is designed to circumvent. Aspiration levels
at least superficially establish minimum requirements, and specifying them for simple single-agent
problems may be noncontroversial. But with multi-agent systems, interdependencies between
decision makers can become complex, and aspiration levels can be conditional (what is satisfactory
for me may depend upon what is satisfactory for you). The current state of affairs regarding
aspiration levels does not address the problem of specifying them in multi-agent contexts. It may be
that what is really needed is a notion of satisficing that does not depend upon arbitrary aspiration
levels or stopping rules.

Let us replace the premise of individual rationality with a concept of being good enough that
is distinct from being approximately best. Mathematically formalizing a concept of being good
enough, however, is not as straightforward as optimizing or equilibrating. Being best is an absolute
concept; it does not come in degrees. Being good enough, however, is not an absolute, and does
come in degrees. Consequently, we must not demand a unique good-enough solution, but instead
be willing to accept varying degrees of adequacy.

This paper proposes a notion for being good enough that is actually more primitive and yet more
complicated to quantify than doing the best thing possible. It is a benefit-cost tradeoff paradigm
of getting at least what one pays for. The reason it is more complicated to quantify is that it
requires the application of two distinct metrics to be compared, whereas doing the best thing
requires only one metric to be maximized. As a formalized means of decision making, this approach
has appeared in at least two very different contexts: economics and epistemology. The former is
intensely practical and concrete, the latter is intensely theoretical and abstract. Economists
implemented the formal practice of benefit-cost analysis to evaluate the wisdom of implementing flood
control policies [81]. The usual procedure is to express all benefits and costs in monetary units and
to sanction a proposition if the benefits are in excess of the estimated costs. The problem with this
concept, however, is that the individual interests are aggregated into a single monolithic interest by
comparing the total benefits with the total costs. Despite its flaws, benefit-cost analysis has proven
to be a useful way to reduce a complex problem to a simpler, more manageable one. One of its
chief virtues is its fundamental simplicity.

A more sophisticated notion of benefit-cost appears in philosophy. Building upon the American
tradition of pragmatism fostered by Peirce, James, and Dewey, Levi [24] has developed a distinctive
school of thought regarding the evolution of knowledge corpora. Unlike the conventional
doctrine of expanding a knowledge corpus by adding information that has been justified as true,
Levi proposes the more modest goal of avoiding error. This theory has been detailed elsewhere
(see [24, 82, 33, 31, 26]). The gist is that, given the task of determining which, if any, of a set
of propositions should be retained in an agent’s knowledge corpus, the agent should evaluate each
proposition on the basis of two distinct criteria: first, the credal, or subjective, probability of it
being true, and second, the informational value^ of rejecting it, that is, the degree to which
discarding the option focuses attention on the kind of information that is demanded by the question.
Thus, for an option to be admissible, it must be both believable and informative; all implausible
or uninformative options should be rejected. Levi constructs an expected epistemic utility function
and shows that it is the difference between credal probability and a constant (the index of caution)
times another probability function, termed the informational-value-of-rejection probability. The
set of options that maximizes this difference is the admissible set.

Single-Agent Satisficing

Levi’s epistemology is to employ two separate and distinct orderings: one to characterize belief,
the other to characterize value. This approach, originally developed for epistemological
decision-making (committing to beliefs), may easily be adapted to the praxeological domain (taking action)
by formulating praxeological analogs to the epistemological notions of truth and informational
value. A natural analog for truth is success, in the sense of achieving the fundamental goals of
taking action. To formulate an analog for informational value, observe that, just as the management of
a finite amount of relevant information is important when inquiring after truth in the
epistemological context, taking effective action requires the management of finite resources, such as conserving
wealth, materials, energy, safety, or other assets. An apt praxeological analog to the informational
value of rejection is the conservational value of rejection. Thus, the context of the decision
problem changes from the epistemological issue of acquiring information while avoiding error to the
praxeological issue of conserving resources while avoiding failure. To emphasize the context shift,
the resulting utility function will be termed praxeic utility.

Let us refer to the degree of resource consumption as rejectability and require the rejectability
function to conform to the axioms of probability. This new terminology emphasizes the semantic
distinction of using the mathematics of probability in a non-conventional way. Thus, for a finite
action space $U$, rejectability is expressed in terms of a mass function $p_R\colon U \to [0, 1]$, such that
$p_R(u) \ge 0$ for all $u \in U$ and $\sum_{u \in U} p_R(u) = 1$. Inefficient options (those with high resource
consumption) should be highly rejectable; that is, if considerations of success are ignored, one
should be prone to reject options that result in large costs, high energy consumption, exposure to
hazard, etc. Normalizing $p_R$ to be a mass function, termed the rejectability mass function, ensures
that the agent will have a unit of resource consumption to apportion among the elements of $U$. The
function $p_R$ is the dis-utility of consuming resources; that is, if $u \in U$ is rejected, then the agent
conserves $p_R(u)$ worth of its unit of resources.

^Informational value, as used here, is distinct from the notion of “value of information” of conventional decision
theory, which deals with the change in expected utility if uncertainty is reduced or eliminated from a decision problem.

The degree that $u$ contributes toward the avoidance of failure is the selectability of $u$. Let us
define the selectability mass function, $p_S\colon U \to [0, 1]$, as the normalized amount of success support
associated with each $u \in U$. Suppose that implementing $u \in U$ would avoid failure. For any
$A \subset U$, the utility of not rejecting $A$ in the interest of avoiding failure is the indicator function

$$I_A(u) = \begin{cases} 1 & \text{if } u \in A, \\ 0 & \text{otherwise.} \end{cases}$$

The praxeic utility of not rejecting $A$ when $u$ avoids failure is the
convex combination of the utility of avoiding failure and the utility of conserving resources:

$$\phi(A, u) = \alpha I_A(u) + (1 - \alpha) \Bigl( 1 - \sum_{v \in A} p_R(v) \Bigr),$$

where $\alpha \in [0, 1]$ is chosen to reflect the agent’s personal weighting of these two desiderata; setting
$\alpha = 1/2$ associates equal concern for avoiding failure and conserving resources.

Generally, the decision maker will not know precisely which $u$ will avoid failure, and so must
weight the utility for each $u$ by the corresponding selectability, and sum over $U$ to compute the
expected praxeic utility.

Dividing by $\alpha$ and ignoring the constant term yields a more convenient but equivalent form:

$$\phi(A) = \sum_{u \in A} \bigl[ p_S(u) - q\, p_R(u) \bigr],$$

where $q = (1 - \alpha)/\alpha$. The term $q$ is the index of caution, and parameterizes the degree to which the
decision maker is willing to accommodate increased costs to achieve success. An equivalent way
of viewing this parameter is as an index of boldness, characterizing the degree to which the
decision maker is willing to risk rejecting successful options in the interest of conserving resources.
Nominally, $q = 1$, which attributes equal weight to success and resource conservation interests.
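The step from the praxeic utility to the simplified form can be written out explicitly. The derivation below is a sketch reconstructed from the definitions of $\phi(A, u)$, $p_S$, and $p_R$, assuming $\alpha > 0$; the symbol $\bar{\phi}$ for the expected praxeic utility is introduced here only for illustration and does not appear in the source.

```latex
\begin{align*}
\bar{\phi}(A)
  &= \sum_{u \in U} \phi(A, u)\, p_S(u) \\
  &= \alpha \sum_{u \in A} p_S(u)
     + (1 - \alpha) \Bigl( 1 - \sum_{v \in A} p_R(v) \Bigr)
     && \text{since } \textstyle\sum_{u \in U} p_S(u) = 1, \\
\frac{\bar{\phi}(A)}{\alpha}
  &= \sum_{u \in A} \bigl[ p_S(u) - q\, p_R(u) \bigr] + q,
     && q = \frac{1 - \alpha}{\alpha}.
\end{align*}
```

The additive constant $q$ does not depend on $A$, so dropping it leaves the maximizing set unchanged. Each option contributes its own term $p_S(u) - q\, p_R(u)$ to the sum, so the sum is maximized by including exactly the options whose term is non-negative.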

Definition 4.7 A decision maker is satisficingly rational if it chooses an option for which the
selectability is greater than or equal to the index of caution times rejectability. □

We adopt this notion of satisficing as the mathematical definition of being good enough. The
largest set of satisficing options is the satisficing set:

$$\Sigma_q = \arg\max_{A \subset U} \phi(A) = \{ u \in U : p_S(u) \ge q\, p_R(u) \}. \qquad (4.8)$$

Notice that (4.8) is in the form of a likelihood ratio test, since the selectability and rejectability
functions are mass functions. Equation (4.8) is the praxeic likelihood ratio test (PLRT).

This concept of satisficing does not require that the set of good-enough solutions be non-empty.
If it is non-empty, however, fundamental consistency requires that the best solution, if it exists
(under the same criteria), must be a member of that set.
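The PLRT is simple to evaluate once the two mass functions are tabulated. The following is a minimal sketch, not the authors' implementation; the three options and their selectability and rejectability values are hypothetical numbers chosen for illustration, and `satisficing_set` is a name introduced here.

```python
def satisficing_set(p_s, p_r, q=1.0):
    """Return the options u with p_S(u) >= q * p_R(u) (the PLRT of Eq. 4.8)."""
    return {u for u in p_s if p_s[u] >= q * p_r[u]}

# Hypothetical three-option problem; both tables are mass functions (sum to 1).
p_s = {"a": 0.5, "b": 0.3, "c": 0.2}   # selectability: success support
p_r = {"a": 0.2, "b": 0.3, "c": 0.5}   # rejectability: resource consumption

sigma_q1 = satisficing_set(p_s, p_r, q=1.0)   # nominal index of caution
sigma_q2 = satisficing_set(p_s, p_r, q=2.0)   # larger q: fewer options retained
```

With these numbers, raising $q$ from 1 to 2 shrinks the retained set, which matches the reading of $q$ as an index of boldness: a larger value makes the decision maker more willing to reject options in order to conserve resources.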

Theorem 2 (a) If $q \le 1$ then $\Sigma_q \neq \emptyset$. (b) If $\Sigma_q \neq \emptyset$ then there exists an optimality criterion that is
consistent with $p_S$ and $p_R$ such that the optimal choice is an element of $\Sigma_q$.

Proof (a) If $\Sigma_q = \emptyset$, then $p_S(u) < q\, p_R(u)$ $\forall u \in U$, and hence $1 = \sum_{u \in U} p_S(u) < q \sum_{u \in U} p_R(u) = q$,
a contradiction. (b) Define $J(u) = p_S(u) - q\, p_R(u)$, and let $u^* = \arg\max_{u \in U} J(u)$. But
$J(u) \ge 0$ $\forall u \in \Sigma_q$, and since $\Sigma_q \neq \emptyset$, $J(u^*) \ge \max_{u \in \Sigma_q} J(u) \ge 0$, which implies $u^* \in \Sigma_q$. □
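Both parts of Theorem 2 can be spot-checked numerically. The sketch below draws random mass functions with exact rational arithmetic (to avoid floating-point ties) and verifies that, for $q \le 1$, the satisficing set is non-empty and contains the maximizer of $J(u) = p_S(u) - q\, p_R(u)$. The option count, the sampling scheme, and the function names are assumptions made for this illustration; a passing check is evidence, not a proof.

```python
from fractions import Fraction
import random

def random_mass(options, rng):
    """Random mass function on `options` built from positive integer weights."""
    weights = [rng.randint(1, 20) for _ in options]
    total = sum(weights)
    return {u: Fraction(w, total) for u, w in zip(options, weights)}

def check_theorem_2(trials=200, seed=0):
    """Return True if no counterexample to Theorem 2 is found in `trials` draws."""
    rng = random.Random(seed)
    options = list(range(5))
    for _ in range(trials):
        p_s, p_r = random_mass(options, rng), random_mass(options, rng)
        q = Fraction(rng.randint(1, 10), 10)          # q in (0, 1]
        sigma = {u for u in options if p_s[u] >= q * p_r[u]}
        u_star = max(options, key=lambda u: p_s[u] - q * p_r[u])
        if not sigma or u_star not in sigma:          # parts (a) and (b)
            return False
    return True

theorem_2_ok = check_theorem_2()
```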

Individual rationality requires that a single ordering be defined for each agent, and that all
of its options be rank-ordered with the best one surviving. This is an inter-option, or extrinsic,
comparison, since it requires the evaluation of an option with respect to quantities other than those
associated with itself (namely, ranking of all other options). The PLRT provides another way to
order, using two preference orderings: one to characterize the desirable, or selectable, attributes of
the options, while the other characterizes the undesirable, or rejectable, attributes, and compares
these two orderings for each option, yielding a binary decision (reject or retain) for each. Such
intra-option comparisons are intrinsic, since they do not require the evaluation of an option with
respect to quantities other than those associated with itself. This intrinsic comparison identifies
all options for which the benefit derived from implementing them is at least as great as the cost
incurred. This notion of satisficing is compatible with Simon’s original notion in that it addresses
exactly the same issue that motivated Simon: to identify options that are good enough by directly
comparing attributes of options. This notion differs only in the standard used for comparison.
The standard for satisficing à la Simon, as with individual rationality in general, is imposed from
without; it is extrinsic, since it relies upon external information (the aspiration level). In contrast,
the standard for satisficing à la the PLRT is set up from within; it is intrinsic, and compares the
positive attributes to the negative attributes of each option.

Intrinsic satisficing may be blended with Simon’s extrinsic approach by specifying the
aspiration level via the PLRT, rather than a fixed threshold. Searching then may stop when the first
element of $\Sigma_q$ is identified. On the other hand, searching may continue to exhaustion, and
additional ordering constraints can be imposed on the elements of $\Sigma_q$ to identify an optimal solution
(for example, see [26]).
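The two search strategies just described can be sketched as follows. The option values are hypothetical, and ranking the retained options by $p_S(u) - q\, p_R(u)$ is only one possible "additional ordering constraint", assumed here for illustration; the source defers the actual choice to [26].

```python
def first_satisficing(options, p_s, p_r, q=1.0):
    """Scan options in the given order and stop at the first one passing the PLRT."""
    for u in options:
        if p_s[u] >= q * p_r[u]:
            return u
    return None  # the satisficing set is empty for this q

def best_satisficing(options, p_s, p_r, q=1.0):
    """Search to exhaustion, then rank the satisficing options by p_S - q * p_R."""
    sigma = [u for u in options if p_s[u] >= q * p_r[u]]
    return max(sigma, key=lambda u: p_s[u] - q * p_r[u]) if sigma else None

# Hypothetical mass functions and an arbitrary search order.
p_s = {"a": 0.5, "b": 0.3, "c": 0.2}
p_r = {"a": 0.2, "b": 0.3, "c": 0.5}
search_order = ["c", "b", "a"]

quick = first_satisficing(search_order, p_s, p_r)     # halts early
thorough = best_satisficing(search_order, p_s, p_r)   # examines every option
```

The early-stopping variant depends on the search order, whereas the exhaustive variant does not; both return only options that pass the PLRT.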

4.6.3  Extension  to  Multiple  Agents 

Individual satisficing is defined in terms of univariate selectability and rejectability mass functions
that provide separate orderings for success and resource consumption, respectively. Just as
univariate probability theory extends to multivariate probability theory, we may extend single-agent
selectability and rejectability mass functions to the multi-agent case by defining a multi-agent (joint)
selectability mass function to characterize group selectability and a joint rejectability function to
characterize group rejectability. Given such functions, we may define a concept of multi-agent
satisficing, or jointly satisficing, as follows:

Definition 4.8 A decision-making group is jointly satisficingly rational if the members of the
group choose a vector of options for which joint selectability is greater than or equal to the index
of caution times joint rejectability. □

For this definition to be useful we must be able to construct the joint selectability and rejectability
functions in a way that accommodates partial preference orderings and conditional preferences.
To  establish  this  utility,  we  first  introduce  the  notion  of  interdependence  and  define  a  satisficing 
game.  We  then  describe  how  the  interdependence  function  can  be  constructed  from  local  orderings, 
leading  to  emergent  total  preference  orderings. 

Interdependence 

An  act  by  any  member  of  a  multi-agent  system  has  possible  ramifications  throughout  the  entire 
community.  Some  agents  may  be  benefited  by  the  act,  some  may  be  damaged,  and  some  may 
be  unaffected.  Furthermore,  although  the  single  agent  may  perform  the  act  in  its  own  interest, 
or  for  the  benefit  (or  detriment)  of  other  agents,  the  act  is  usually  not  implemented  free  of  cost. 
Resources  are  expended,  or  risk  is  taken,  or  some  other  cost,  penalty,  or  unpleasant  consequence 
is  incurred  by  the  agent  itself  or  by  other  agents.  Although  these  undesirable  consequences  may 
be  defined  independently  from  the  benefits,  the  measures  associated  with  benefits  and  costs  cannot 
be specified independently of each other due to the possibility of interaction. A critical aspect of
modeling  the  behavior  of  such  a  society,  therefore,  is  the  means  of  representing  the  interdependence 
of  both  positive  and  negative  consequences  of  all  possible  joint  actions  that  could  be  undertaken. 

Definition 4.9 Let $\{X_1, \ldots, X_N\}$ be an $N$-member multi-agent system. A mixture^ is any subset
of agents considered in terms of their interaction with each other, exclusively of possible
interactions with other agents not in the subset.

A selectability mixture, denoted $\mathcal{S} = S_{i_1} \cdots S_{i_k}$, is a mixture consisting of agents $X_{i_1}, \ldots, X_{i_k}$
being considered from the point of view of success. The joint selectability mixture is the selectability
mixture consisting of all agents in the system, denoted $\mathbf{S} = S_1 \cdots S_N$.

A rejectability mixture, denoted $\mathcal{R} = R_{j_1} \cdots R_{j_\ell}$, is a mixture consisting of agents $X_{j_1}, \ldots, X_{j_\ell}$
being considered from the point of view of resource consumption. The joint rejectability mixture
is the rejectability mixture consisting of all agents in the system, denoted $\mathbf{R} = R_1 \cdots R_N$.

An intermixture is the concatenation of a selectability mixture and a rejectability mixture, and
is denoted $\mathcal{SR} = S_{i_1} \cdots S_{i_k} R_{j_1} \cdots R_{j_\ell}$. The joint intermixture is the concatenation of the joint
selectability and joint rejectability mixtures, and is denoted $\mathbf{SR} = S_1 \cdots S_N R_1 \cdots R_N$. □

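Definition 4.10 Let $U_i$ be the action space for $X_i$, $i = 1, \ldots, N$. The product action space,
denoted $\mathbf{U} = U_1 \times \cdots \times U_N$, is the set of all $N$-tuples $\mathbf{u} = (u_1, \ldots, u_N)$ where $u_i \in U_i$. The
selectability action space associated with a selectability mixture $\mathcal{S} = S_{i_1} \cdots S_{i_k}$ is the product
space $\mathbf{U}_{\mathcal{S}} = U_{i_1} \times \cdots \times U_{i_k}$. The rejectability action space associated with a rejectability mixture
$\mathcal{R} = R_{j_1} \cdots R_{j_\ell}$ is the product space $\mathbf{U}_{\mathcal{R}} = U_{j_1} \times \cdots \times U_{j_\ell}$. The interaction space associated
with an intermixture $\mathcal{SR} = S_{i_1} \cdots S_{i_k} R_{j_1} \cdots R_{j_\ell}$ is the product space
$\mathbf{U}_{\mathcal{SR}} = \mathbf{U}_{\mathcal{S}} \times \mathbf{U}_{\mathcal{R}} = U_{i_1} \times \cdots \times U_{i_k} \times U_{j_1} \times \cdots \times U_{j_\ell}$.
The joint interaction space is $\mathbf{U}_{\mathbf{SR}} = \mathbf{U} \times \mathbf{U}$. □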

^Not to be confused with a mixture of distributions, which is a convex combination of probability distributions.

Definition 4.11 A selectability mass function (smf) for the mixture $\mathcal{S} = S_{i_1} \cdots S_{i_k}$ is a mass
function denoted $p_{\mathcal{S}} = p_{S_{i_1}, \ldots, S_{i_k}}\colon \mathbf{U}_{\mathcal{S}} \to [0, 1]$. The joint smf is an smf for $\mathbf{S}$, denoted $p_{\mathbf{S}}$.

A rejectability mass function (rmf) for the mixture $\mathcal{R} = R_{j_1} \cdots R_{j_\ell}$ is a mass function
denoted $p_{\mathcal{R}} = p_{R_{j_1}, \ldots, R_{j_\ell}}\colon \mathbf{U}_{\mathcal{R}} \to [0, 1]$. The joint rmf is an rmf for $\mathbf{R}$, denoted $p_{\mathbf{R}}$.

An interdependence mass function (IMF) for the intermixture $\mathcal{SR} = S_{i_1} \cdots S_{i_k} R_{j_1} \cdots R_{j_\ell}$ is
a mass function denoted $p_{\mathcal{SR}} = p_{S_{i_1}, \ldots, S_{i_k}, R_{j_1}, \ldots, R_{j_\ell}}\colon \mathbf{U}_{\mathcal{S}} \times \mathbf{U}_{\mathcal{R}} \to [0, 1]$. The joint IMF is an IMF
for $\mathbf{SR}$, denoted $p_{\mathbf{SR}}$. □

Let $\mathbf{v} \in \mathbf{U}_{\mathcal{S}}$ and $\mathbf{w} \in \mathbf{U}_{\mathcal{R}}$ be two option vectors. Then $p_{\mathcal{SR}}(\mathbf{v}, \mathbf{w})$ is a representation of the
success support associated with $\mathbf{v}$ and the resource consumption associated with $\mathbf{w}$ when the two
option vectors are viewed simultaneously. In other words, $p_{\mathcal{SR}}(\mathbf{v}, \mathbf{w})$ is the mass associated with
selecting $\mathbf{v}$ in the interest of success and rejecting $\mathbf{w}$ in the interest of conserving resources.

Satisficing  Games 

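The interdependence function incorporates all of the information relevant to the multi-agent
decision problem. From this function we may derive the joint selectability and rejectability marginals
as

$$p_{\mathbf{S}}(\mathbf{u}) = \sum_{\mathbf{v} \in \mathbf{U}} p_{\mathbf{SR}}(\mathbf{u}, \mathbf{v}) \qquad (4.9)$$

$$p_{\mathbf{R}}(\mathbf{v}) = \sum_{\mathbf{u} \in \mathbf{U}} p_{\mathbf{SR}}(\mathbf{u}, \mathbf{v}) \qquad (4.10)$$

for all $(\mathbf{u}, \mathbf{v}) \in \mathbf{U} \times \mathbf{U}$. Once these quantities are in place, a satisficing game can be formally
defined.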

Definition 4.12 A satisficing game for a set of decision makers $\{X_1, \ldots, X_N\}$ is a triple $\{\mathbf{U}, p_{\mathbf{S}}, p_{\mathbf{R}}\}$,
where $\mathbf{U}$ is a joint action space, $p_{\mathbf{S}}$ is the joint selectability function, and $p_{\mathbf{R}}$ is the joint rejectability
function. The joint solution to a satisficing game with index of caution $q$ is the set

$$\boldsymbol{\Sigma}_q = \{ \mathbf{u} \in \mathbf{U} : p_{\mathbf{S}}(\mathbf{u}) \ge q\, p_{\mathbf{R}}(\mathbf{u}) \}. \qquad (4.11)$$

$\boldsymbol{\Sigma}_q$ is termed the joint satisficing set, and elements of $\boldsymbol{\Sigma}_q$ are jointly satisficing actions. Equation
(4.11) is the joint praxeic likelihood ratio test (JPLRT). □

The JPLRT establishes group preferences and identifies the joint option vectors that are
satisficing from the group perspective. The marginal selectability and rejectability mass functions for
each $X_i$ may be obtained from (4.9) and (4.10), yielding:

$$p_{S_i}(u_i) = \sum_{u_j \in U_j,\ j \neq i} p_{\mathbf{S}}(u_1, \ldots, u_N) \qquad (4.12)$$

$$p_{R_i}(u_i) = \sum_{u_j \in U_j,\ j \neq i} p_{\mathbf{R}}(u_1, \ldots, u_N). \qquad (4.13)$$

Definition 4.13 The individual solutions to the satisficing game $\{\mathbf{U}, p_{\mathbf{S}}, p_{\mathbf{R}}\}$ are the sets

$$\Sigma_q^i = \{ u_i \in U_i : p_{S_i}(u_i) \ge q\, p_{R_i}(u_i) \}, \qquad (4.14)$$

where $p_{S_i}$ and $p_{R_i}$ are given by (4.12) and (4.13), respectively, for $i = 1, \ldots, N$. The product of
the individually satisficing sets is the satisficing rectangle:

$$\mathfrak{R}_q = \Sigma_q^1 \times \cdots \times \Sigma_q^N = \{ (u_1, \ldots, u_N) : u_i \in \Sigma_q^i \}.$$

□
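The joint and individual solutions can be computed directly from tabulated joint mass functions. The sketch below uses a hypothetical two-agent game with two options per agent; the weights are invented for illustration and exact rationals are used so that the comparisons are not affected by rounding. All function names are introduced here, not taken from the source.

```python
from fractions import Fraction
from itertools import product

ACTIONS = [(0, 1), (0, 1)]            # hypothetical action spaces U_1 and U_2
JOINT = list(product(*ACTIONS))       # product action space U = U_1 x U_2

# Hypothetical joint mass functions on U (weights out of 10, in JOINT order).
p_S = dict(zip(JOINT, (Fraction(w, 10) for w in (2, 4, 4, 0))))
p_R = dict(zip(JOINT, (Fraction(w, 10) for w in (3, 1, 1, 5))))

def marginal(p_joint, i):
    """Marginal for agent i (Eqs. 4.12/4.13): sum over the other agents' options."""
    out = {u_i: Fraction(0) for u_i in ACTIONS[i]}
    for u, mass in p_joint.items():
        out[u[i]] += mass
    return out

def joint_set(q):
    """Joint satisficing set of Eq. (4.11)."""
    return {u for u in JOINT if p_S[u] >= q * p_R[u]}

def individual_sets(q):
    """Individually satisficing sets of Eq. (4.14), one per agent."""
    sets = []
    for i in range(len(ACTIONS)):
        ps_i, pr_i = marginal(p_S, i), marginal(p_R, i)
        sets.append({u_i for u_i in ACTIONS[i] if ps_i[u_i] >= q * pr_i[u_i]})
    return sets

def rectangle(q):
    """Satisficing rectangle: Cartesian product of the individual sets."""
    return set(product(*individual_sets(q)))

sigma = joint_set(1)      # jointly satisficing vectors at q = 1
rect = rectangle(1)       # satisficing rectangle at q = 1
```

For these particular numbers the joint satisficing set and the satisficing rectangle turn out to be disjoint at $q = 1$: each agent individually favors option 0, but the vector $(0, 0)$ fails the joint test.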

It remains to determine the relationship between the jointly satisficing set $\boldsymbol{\Sigma}_q$ and the
individually satisficing sets, $\Sigma_q^i$, $i = 1, \ldots, N$. Unfortunately, it is not generally true that either $\boldsymbol{\Sigma}_q \subset \mathfrak{R}_q$
or $\mathfrak{R}_q \subset \boldsymbol{\Sigma}_q$. The following result, however, is very useful.

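Theorem 3 (The Negotiation Theorem) If $u_i$ is individually satisficing for $X_i$, that is, if $u_i \in \Sigma_q^i$,
then it must be the $i$th element of some jointly satisficing vector $\mathbf{u} \in \boldsymbol{\Sigma}_q$.

Proof This theorem is proven by establishing the contrapositive, namely, that if $u_i$ is not the
$i$th element of any $\mathbf{u} \in \boldsymbol{\Sigma}_q$, then $u_i \notin \Sigma_q^i$. Without loss of generality, let $i = 1$. By hypothesis,
$p_{\mathbf{S}}(u_1, \mathbf{v}) < q\, p_{\mathbf{R}}(u_1, \mathbf{v})$ for all $\mathbf{v} \in U_2 \times \cdots \times U_N$, so
$p_{S_1}(u_1) = \sum_{\mathbf{v}} p_{\mathbf{S}}(u_1, \mathbf{v}) < q \sum_{\mathbf{v}} p_{\mathbf{R}}(u_1, \mathbf{v}) = q\, p_{R_1}(u_1)$,
hence $u_1 \notin \Sigma_q^1$. □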
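The Negotiation Theorem can be spot-checked on randomly generated games. The sketch below assumes a hypothetical two-agent game with three and two options respectively, draws random joint mass functions with exact rational arithmetic, and confirms that every individually satisficing option appears in at least one jointly satisficing vector. The game size, sampling scheme, and names are illustrative assumptions; a passing check is evidence, not a proof.

```python
from fractions import Fraction
from itertools import product
import random

def random_mass(keys, rng):
    """Random mass function on `keys`; some entries may be zero."""
    weights = [rng.randint(0, 9) for _ in keys]
    weights[0] += 1                       # guarantee a nonzero total
    total = sum(weights)
    return {k: Fraction(w, total) for k, w in zip(keys, weights)}

def negotiation_theorem_holds(trials=300, seed=1):
    """Return True if no counterexample to Theorem 3 is found in `trials` draws."""
    rng = random.Random(seed)
    actions = [(0, 1, 2), (0, 1)]         # hypothetical action spaces
    joint = list(product(*actions))
    for _ in range(trials):
        p_S, p_R = random_mass(joint, rng), random_mass(joint, rng)
        q = Fraction(rng.randint(1, 20), 10)          # q in (0, 2]
        sigma = {u for u in joint if p_S[u] >= q * p_R[u]}
        for i, U_i in enumerate(actions):
            for u_i in U_i:
                ps_i = sum(m for u, m in p_S.items() if u[i] == u_i)
                pr_i = sum(m for u, m in p_R.items() if u[i] == u_i)
                individually_ok = ps_i >= q * pr_i
                appears_in_joint = any(u[i] == u_i for u in sigma)
                if individually_ok and not appears_in_joint:
                    return False
    return True

theorem_3_ok = negotiation_theorem_holds()
```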

The  content  of  this  theorem  is  that  no  one  is  ever  completely  frozen  out  of  a  deal — every 
decision  maker  has,  from  its  own  perspective,  a  seat  at  the  negotiating  table.  This  is  perhaps  the 
weakest condition under which negotiations are possible. If $\boldsymbol{\Sigma}_q \cap \mathfrak{R}_q$ is empty, then there are no
jointly  satisficing  options  that  are  also  individually  satisficing  for  all  players  for  the  given  value  of 
q.  The  following  corollary,  whose  proof  is  trivial  and  is  omitted,  addresses  this  situation. 

Corollary 1 There exists an index of caution value $q_0 \in [0, 1]$ such that $\boldsymbol{\Sigma}_{q_0} \cap \mathfrak{R}_{q_0} \neq \emptyset$.

Thus, if the players are each willing to lower their standards sufficiently by decreasing the index of
caution, $q$, they may eventually reach a compromise that is both jointly and individually satisficing,
according to a reduced level of what it means to be good enough. The parameter $q_0$ is a measure
of how much they must be willing to compromise to avoid an impasse. Note that willingness to
lower one’s standards is not total capitulation, since the participants are able to control the degree
of compromise by setting a limit on how small a value of $q$ they can tolerate. Thus, a controlled
amount of altruism is possible with this formulation. But, if any player’s limit is reached without
a mutual agreement being obtained, the game has reached an impasse.
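The compromise process described above can be sketched as a loop that lowers $q$ until the joint satisficing set meets the satisficing rectangle. The game below is hypothetical, and several simplifications are assumptions of this sketch rather than part of the source: all players share a single $q$, it is lowered in fixed steps of 1/10, and `q_floor` stands for the smallest value any player will tolerate.

```python
from fractions import Fraction
from itertools import product

ACTIONS = [(0, 1), (0, 1)]            # hypothetical action spaces U_1 and U_2
JOINT = list(product(*ACTIONS))
p_S = dict(zip(JOINT, (Fraction(w, 10) for w in (2, 4, 4, 0))))
p_R = dict(zip(JOINT, (Fraction(w, 10) for w in (3, 1, 1, 5))))

def compromise_set(q):
    """Vectors that are jointly satisficing and lie in the satisficing rectangle."""
    sigma = {u for u in JOINT if p_S[u] >= q * p_R[u]}
    individual = []
    for i, U_i in enumerate(ACTIONS):
        ps_i = {a: sum(m for u, m in p_S.items() if u[i] == a) for a in U_i}
        pr_i = {a: sum(m for u, m in p_R.items() if u[i] == a) for a in U_i}
        individual.append({a for a in U_i if ps_i[a] >= q * pr_i[a]})
    return sigma & set(product(*individual))

def negotiate(q_start=Fraction(1), q_floor=Fraction(0), step=Fraction(1, 10)):
    """Lower q in fixed steps until a compromise exists or the floor is passed."""
    q = q_start
    while q >= q_floor:
        agreed = compromise_set(q)
        if agreed:
            return q, agreed
        q -= step
    return None, set()                # impasse: a player's limit was reached first

q0, agreed = negotiate()                        # players accept any q >= 0
stalled = negotiate(q_floor=Fraction(7, 10))    # a player refuses q below 0.7
```

In this example no compromise exists at $q = 1$, one appears once $q$ has been lowered to 3/5, and a player who refuses to go below 7/10 forces an impasse.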

It  may  be  observed  that  the  negotiation  theorem  does  not  provide  for  solutions  which  are  both 
individually  and  jointly  satisficing  for  all  agents.  This  requires  separate  efforts  at  coordination  in 
an  active  process  of  working  toward  an  accord.  This  process  is  explored  in  [21]. 




Synthesis 


The joint IMF provides a complete description of the individual and interagent relationships in
terms of their positive and negative consequences, and provides a total ordering for both selectability
and rejectability for the entire community as well as for each individual. Basing a praxeology
on the IMF does not, at first glance, however, appear to conform to the requirement to
accommodate partial orderings, but first glances can be misleading. Fortunately, the IMF, based as it is
on the mathematics of probability theory, can draw upon a fundamental property of that theory,
namely, the law of compound probability, to simplify its construction.

The law of compound probability says that joint probabilities can be constructed from
conditional probabilities and marginal probabilities. For example, we may construct a joint probability
mass function $p_{XY}(x, y)$ from the conditional mass function $p_{X|Y}(x|y)$ and the marginal $p_Y(y)$
according to Bayes rule, yielding $p_{XY}(x, y) = p_{X|Y}(x|y)\, p_Y(y)$. This relationship may be extended
to the general multivariate case by repeated applications, yielding what is often termed the chain
rule.

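Definition 4.14 Given an intermixture $\mathcal{SR} = S_{i_1} \cdots S_{i_k} R_{j_1} \cdots R_{j_\ell}$, a subintermixture of $\mathcal{SR}$ is
an intermixture formed by concatenating subsets of $\mathcal{S}$ and $\mathcal{R}$: $\mathcal{S}_1\mathcal{R}_1 = S_{i'_1} \cdots S_{i'_p} R_{j'_1} \cdots R_{j'_r}$,
where $\{i'_1, \ldots, i'_p\} \subset \{i_1, \ldots, i_k\}$ and $\{j'_1, \ldots, j'_r\} \subset \{j_1, \ldots, j_\ell\}$. The notation $\mathcal{S}_1\mathcal{R}_1 \subset \mathcal{SR}$
indicates that $\mathcal{S}_1\mathcal{R}_1$ is a subintermixture of $\mathcal{SR}$.

The $\mathcal{SR}$-complementary subintermixture associated with a subintermixture $\mathcal{S}_1\mathcal{R}_1$ of an
intermixture $\mathcal{SR}$, denoted $\mathcal{SR} \setminus \mathcal{S}_1\mathcal{R}_1$, is an intermixture created by concatenating the selectability and
rejectability mixtures formed by the relative complements of $\mathcal{S}_1$ and $\mathcal{R}_1$. Clearly, $\mathcal{SR} \setminus \mathcal{S}_1\mathcal{R}_1 \subset \mathcal{SR}$.
$\mathcal{SR}$ is the union of $\mathcal{SR} \setminus \mathcal{S}_1\mathcal{R}_1$ and $\mathcal{S}_1\mathcal{R}_1$, denoted $\mathcal{SR} = \mathcal{SR} \setminus \mathcal{S}_1\mathcal{R}_1 \cup \mathcal{S}_1\mathcal{R}_1$. □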

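Definition 4.15 Let $\mathcal{SR}$ be an intermixture with subintermixture $\mathcal{S}_1\mathcal{R}_1$. A conditional
interdependence mass function, denoted $p_{\mathcal{SR} \setminus \mathcal{S}_1\mathcal{R}_1 | \mathcal{S}_1\mathcal{R}_1}$, is a mapping of
$\mathbf{U}_{\mathcal{SR} \setminus \mathcal{S}_1\mathcal{R}_1} \times \mathbf{U}_{\mathcal{S}_1\mathcal{R}_1}$ into $[0, 1]$ such
that, for every $\mathbf{v} \in \mathbf{U}_{\mathcal{S}_1\mathcal{R}_1}$, $p_{\mathcal{SR} \setminus \mathcal{S}_1\mathcal{R}_1 | \mathcal{S}_1\mathcal{R}_1}(\cdot | \mathbf{v})$ is a mass function on $\mathbf{U}_{\mathcal{SR} \setminus \mathcal{S}_1\mathcal{R}_1}$. □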

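All conditional interdependence mass functions must be consistent with interdependence
mass functions. That is, for $\mathcal{SR}$ an arbitrary intermixture with subintermixture $\mathcal{S}_1\mathcal{R}_1$, with
$\mathbf{w} \in \mathbf{U}_{\mathcal{SR} \setminus \mathcal{S}_1\mathcal{R}_1}$ and $\mathbf{v} \in \mathbf{U}_{\mathcal{S}_1\mathcal{R}_1}$, Bayes rule requires that

$$p_{\mathcal{SR}}(\mathbf{v}, \mathbf{w}) = p_{\mathcal{SR} \setminus \mathcal{S}_1\mathcal{R}_1 | \mathcal{S}_1\mathcal{R}_1}(\mathbf{w} | \mathbf{v}) \cdot p_{\mathcal{S}_1\mathcal{R}_1}(\mathbf{v}). \qquad (4.15)$$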

This is the chain rule applied to intermixtures. Repeated application of the chain rule
provides a way to construct global behavior from local behavioral relationships. To illustrate, let
$\{X_1, X_2, X_3\}$ be a multi-agent system and let $\mathcal{S} = S_1 S_2$ and $\mathcal{R} = R_3$. Then $\mathcal{SR} = S_1 S_2 R_3$ and
$\mathbf{SR} \setminus \mathcal{SR} = S_3 R_1 R_2$. The IMF is

$$p_{S_1,S_2,S_3,R_1,R_2,R_3}(v_1, v_2, v_3, w_1, w_2, w_3) =
  p_{S_3,R_1,R_2|S_1,S_2,R_3}(v_3, w_1, w_2 | v_1, v_2, w_3) \cdot p_{S_1,S_2,R_3}(v_1, v_2, w_3).$$

Now let $S_1$ be a subintermixture of $S_1 S_2 R_3$, so that $\mathcal{SR} \setminus S_1 = S_2 R_3$. We may apply the
chain rule to this subintermixture to obtain

$$p_{S_1,S_2,R_3}(v_1, v_2, w_3) = p_{S_1|S_2,R_3}(v_1 | v_2, w_3) \cdot p_{S_2,R_3}(v_2, w_3),$$

yielding

$$p_{S_1,S_2,S_3,R_1,R_2,R_3}(v_1, v_2, v_3, w_1, w_2, w_3) =
  p_{S_3,R_1,R_2|S_1,S_2,R_3}(v_3, w_1, w_2 | v_1, v_2, w_3)
  \cdot p_{S_1|S_2,R_3}(v_1 | v_2, w_3) \cdot p_{S_2,R_3}(v_2, w_3). \qquad (4.16)$$

The term $p_{S_3,R_1,R_2|S_1,S_2,R_3}(v_3, w_1, w_2 | v_1, v_2, w_3)$ is the conditional selectability/rejectability
associated with $X_3$ selecting $v_3$, $X_1$ rejecting $w_1$, and $X_2$ rejecting $w_2$, given that $X_1$ prefers to select
$v_1$, $X_2$ prefers to select $v_2$, and $X_3$ prefers to reject $w_3$; $p_{S_1|S_2,R_3}(v_1 | v_2, w_3)$ characterizes $X_1$’s
selectability for $v_1$ given $X_2$ prefers to select $v_2$ and $X_3$ prefers to reject $w_3$; and $p_{S_2,R_3}(v_2, w_3)$ is the
joint selectability/rejectability of $X_2$ selecting $v_2$ and $X_3$ rejecting $w_3$. The various terms of this
factorization may often be simplified further. For example, suppose that $X_1$ is indifferent to $X_3$’s
rejectability posture, in which case we may simplify $p_{S_1|S_2,R_3}(v_1 | v_2, w_3)$ to become $p_{S_1|S_2}(v_1 | v_2)$.
Clearly, there are many ways to factor the interdependence function according to the chain rule.
The design issue, however, is to implement a factorization that allows the desired local
interdependencies to be expressed through the appropriate conditional interdependencies. The construction
of the interdependence function is highly application dependent, and there is no general algorithm
or procedure that a designer should follow for its synthesis. There are, however, some general
guidelines for the construction of interdependence functions.

1.  Form  operational  definitions  of  selectability  and  rejectability  for  individuals  or  groups,  as 
appropriate  from  the  context  of  the  problem. 

2.  Identify  the  local  orderings  that  are  desirable,  and  map  these  into  conditional  selectability 
and  rejectability  functions. 

3.  Factor  the  interdependence  function  such  that  the  desired  conditional  selectability/rejectability 
relationships  are  products  in  the  factorization. 

4.  Eliminate  all  irrelevant  interdependencies  in  the  factors. 
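The guidelines above can be illustrated with a small two-agent construction. The sketch below assumes a hypothetical factorization in which $X_1$'s selectability is conditioned on $X_2$'s selectability and both rejectabilities are independent of everything else; every table and name is invented for illustration and is not taken from the source.

```python
from fractions import Fraction as F
from itertools import product

OPTIONS = (0, 1)   # hypothetical action space shared by X_1 and X_2

# Local orderings (guideline 2); all values are hypothetical.
p_S2 = {0: F(3, 4), 1: F(1, 4)}                    # X_2's own selectability
p_S1_given_S2 = {0: {0: F(9, 10), 1: F(1, 10)},    # indexed as [v2][v1]:
                 1: {0: F(2, 10), 1: F(8, 10)}}    # X_1 tends to follow X_2
p_R1 = {0: F(1, 2), 1: F(1, 2)}                    # X_1's rejectability
p_R2 = {0: F(1, 3), 2 - 1: F(2, 3)}                # X_2's rejectability

def build_imf():
    """Chain-rule product of the local factors (guidelines 3 and 4)."""
    imf = {}
    for v1, v2, w1, w2 in product(OPTIONS, repeat=4):
        imf[(v1, v2), (w1, w2)] = (
            p_S1_given_S2[v2][v1] * p_S2[v2] * p_R1[w1] * p_R2[w2]
        )
    return imf

imf = build_imf()

# Joint selectability (Eq. 4.9): sum the IMF over the rejectability arguments.
p_S = {}
for (v, w), mass in imf.items():
    p_S[v] = p_S.get(v, F(0)) + mass

# Individual selectability for X_1 (Eq. 4.12), obtained from the joint function.
p_S1 = {a: sum(m for v, m in p_S.items() if v[0] == a) for a in OPTIONS}
```

Note that $X_1$'s unconditional selectability is never specified directly in this sketch; it is obtained by marginalizing the product of the local factors.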

Meso-Emergence 

Although  each  of  the  conditional  mass  functions  in  the  factorization  of  the  interdependence  func¬ 
tion  is  a  total  ordering,  it  is  a  local  total  ordering,  and  involves  only  a  subset  of  agents  and  concerns. 
Each  of  these  local  total  orderings  is  only  a  partial  ordering,  however,  if  viewed  from  the  global, 
or  community-wide,  perspective,  since  orderings  are  not  defined  for  all  possible  option  vectors. 
By  combining  such  local  total  orderings  together  according  to  the  chain  rule,  a  global  total  order¬ 
ing  emerges.  The  joint  selectability  and  rejectability  mass  functions  then  characterize  emergent 
global  behavior,  and  the  individual  selectability  and  rejectability  marginals  characterize  emergent 
individual  behavior.  Thus,  both  individual  and  group  behavior  emerge  as  consequences  of  local 
conditional  interests  that  propagate  throughout  the  community  from  the  interdependent  local  to  the 
interdependent global and from the conditional to the unconditional.

Synthesizing the IMF exploits an emergence property that is quite different from the temporal,
or evolutionary, emergence that can occur with repeated-play games. To differentiate these
two types of emergence, let us refer to the former as spatial emergence. Temporal emergence
is  an  inter-game  phenomenon  that  produces  relationships  between  agents  with  repeated  play  as 
time  propagates,  and  spatial  emergence  is  an  intra-game  phenomenon  that  produces  relationships 
between  agents  as  interests  propagate  through  the  agent  system  with  single-play.  Perhaps  the 
most common example of spatial emergence is the micro-to-macro, or bottom-up, phenomenon of
group behavior emerging as a consequence of individual interests, as occurs with social choice
theory [83, 40] and with evolutionary games [84, 85]. A second approach is a macro-to-micro or
top-down approach, where individual behaviors emerge as a consequence of group interests. Satisficing
praxeology accommodates both of these approaches. It also points to a third approach, that
of  an  inside-out,  or  meso-to-micro/macro  view,  where  intermediate-level  conditional  preferences 
propagate  up  to  the  group  level  and  down  to  the  individual  level.  Let  us  term  this  type  of  spatial 
emergence  meso-emergence. 

The conditional selectability and rejectability mass functions are constructed as functions of
the preferences of the other agents. For example, the local total ordering function $p_{S_1|S_2}(u_1 \mid v_2)$
characterizes $X_1$’s ordering of its selectability preferences given that $X_2$ prefers $v_2$. This structure
permits $X_1$ to ascribe some weight to $X_2$’s interests without requiring $X_1$ to abandon its own
interests in deference to $X_2$. By adjusting these weights, $X_1$ may control the degree to which it
is willing to compromise its egoistic values to accommodate $X_2$.
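As a minimal sketch of how a conditional factor and a marginal combine under the chain rule, the following Python fragment builds a joint selectability mass function for two agents and recovers the marginal for $X_1$. The option labels and all numerical values are assumptions chosen for illustration, and only the selectability part of the interdependence function is shown.

```python
from itertools import product

# Hypothetical option sets for two agents X1 and X2 (labels are illustrative).
U1 = ("a", "b")
U2 = ("a", "b")

# Marginal selectability of X2 over its own options (assumed numbers).
p_S2 = {"a": 0.7, "b": 0.3}

# Conditional selectability of X1 given the option X2 prefers (assumed numbers).
# Each inner dict is a local total ordering over U1 and sums to 1.
p_S1_given_S2 = {
    "a": {"a": 0.8, "b": 0.2},
    "b": {"a": 0.4, "b": 0.6},
}

# Chain rule: joint(u1, v2) = p_{S1|S2}(u1 | v2) * p_{S2}(v2).
joint = {(u1, v2): p_S1_given_S2[v2][u1] * p_S2[v2] for u1, v2 in product(U1, U2)}

# The marginal for X1 emerges by summing the joint over X2's options.
p_S1 = {u1: sum(joint[(u1, v2)] for v2 in U2) for u1 in U1}
```

Changing the rows of `p_S1_given_S2` changes how strongly the preference of $X_2$ shifts the ordering of $X_1$, which is one way to picture the adjustable weights mentioned above.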

4.6.4  Discussion 

The group decision problem has perplexed researchers for decades. As Raiffa [86, pp. 233-237] put it
over  thirty  years  ago,  “I  find  myself  in  that  uncomfortable  position  in  which  the  more  I  think  the 
more  confused  I  become.”  The  source  of  Raiffa’s  concern,  it  seems,  is  that  it  is  difficult  to  reconcile 
the  notion  of  individual  rationality  with  the  belief  that  “somehow  the  group  entity  is  more  than  the 
totality  of  its  members.”  Yet,  researchers  have  steadfastly  and  justifiably  refused  to  consider  the 
group  entity  itself  as  a  decision-making  superplayer. 

Satisficing  game  theory  offers  a  way  to  account  for  the  group  entity  without  the  fabrication 
of  a  superplayer.  This  accounting  is  done  through  the  conditional  relationships  that  are  expressed 
through  the  interdependence  function  due  to  its  mathematical  structure  as  a  probability  (but  not 
with the usual semantics of randomness). Just as a joint probability function is more than the
totality  of  the  marginals,  the  interdependence  function  is  more  than  the  totality  of  the  individual 
selectability  and  rejectability  functions.  It  is  only  in  the  case  of  stochastic  independence  that  a 
joint  distribution  can  be  constructed  from  the  marginal  distributions,  and  it  is  only  in  the  case  of 
complete  inter-independence  that  group  welfare  can  be  expressed  in  terms  of  the  welfare  of  the 
individuals. 
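The point about marginals can be illustrated with a small sketch: two different joint mass functions over a pair of binary options share the same marginals, and only the independent one is recovered from the product of its marginals. The two tables below are assumed values, not taken from the report.

```python
def marginals(joint):
    """Return the row and column marginals of a 2x2 joint mass function."""
    rows = [sum(r) for r in joint]
    cols = [sum(c) for c in zip(*joint)]
    return rows, cols

def product_of_marginals(joint):
    """Joint mass function that would result if the two variables were independent."""
    rows, cols = marginals(joint)
    return [[r * c for c in cols] for r in rows]

# Two assumed joints with identical marginals (0.5, 0.5) in each variable.
independent = [[0.25, 0.25], [0.25, 0.25]]
coordinated = [[0.5, 0.0], [0.0, 0.5]]
```

Here `product_of_marginals(coordinated)` returns the `independent` table, so the coordination present in `coordinated` is lost when only the marginals are kept.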

The current literature on negotiation concentrates heavily on ways to obtain just-in-time negotiated
solutions that can be accomplished within real-time computational constraints, but it does
so primarily from the point of view of individual rationality. There is no reason, however, to limit
consideration to that perspective. This paper is an invitation to expand to a broader perspective
and consider dealing with the exigencies of practical decision making in the light of satisficing
game theory as well as with conventional theory.

Negotiation under (bounded or unbounded) rational choice requires the decision maker to attempt
to maximize its own benefit. This is a valid, and perhaps the only reliable, paradigm in
extremely conflictive environments, such as zero-sum games, but when the opportunity for cooperation
exists, the rational choice paradigm is overly pessimistic and unnecessarily limits the scope
of negotiation.

The appeal of optimization, no matter how approximate, is a strongly entrenched attitude that
dominates current decision making practice. There is great comfort in following traditional paths,
especially when those paths are founded on such a rich and enduring tradition as rational choice
affords. But when synthesizing an artificial negotiatory system, the designer has the opportunity
to impose upon the agents a more socially accommodating paradigm. The satisficing game theory
presented in this paper provides a sociological decision-making mechanism that seamlessly
accounts  for  group  and  individual  interests,  and  provides  a  rich  framework  for  negotiation  to  occur 
between  agents  who  share  common  interests  and  who  are  willing  to  give  deference  to  each  other. 
Rather  than  depending  upon  the  non-cooperative  equilibria  defined  (even  if  only  approximately)  by 
individual-benefit  saddle  points,  this  alternative  may  lead  to  the  more  socially  realistic  and  valuable 
equilibria  of  shared  interests  and  acceptable  compromises. 


4.7  A  Market  Approach  to  Coordination 

In  this  section,  we  present  concepts  that  attempt  to  relate  the  praxeological  approach  with  market 
dynamics.  The  market  concepts  are  derived  following  ideas  of  [87]. 

Consider an economy consisting of needs and abilities to satisfy those needs. Perhaps we can characterize
both as “goods”. Some agents will bring as their goods an excess of needs, which they desire
to trade for the abilities of someone else. Others will bring an excess of abilities, which they desire
to trade for the needs of someone else. Somehow price needs to fit into all of this.

In the economy of interest here, we envision two classes of “goods.” Let $\mathcal{D} = \{d_1, d_2, \ldots, d_{n_1}\}$
denote a class of demands which may be brought to bear by some agents, and let $\mathcal{S} = \{s_1, s_2, \ldots, s_{n_2}\}$
denote a class of supplies which may be brought to bear by some agents. There are thus $n = n_1 + n_2$
types of goods.

In the economy of supplies and demands, demands can be met by certain equivalences in
supplies. For example, we might have

$$d_1 = 2s_1 + 3s_2,$$

so that a single unit of demand of type $d_1$ is met by 2 units of supply $s_1$ and 3 units of supply $s_2$.
We assume that there is a linear constitutive relationship

$$\mathbf{d} = B\mathbf{s}.$$

Assume that there are $M$ agents in the system. Agent $i$ is provided with the allocation of goods
$\mathbf{w}^i = [w^i_1, w^i_2, \ldots, w^i_n]^T$, where $w^i_j$ is the amount of good $j$ possessed by agent $i$.
Each agent is also provided with a utility function $f^i(\mathbf{w}^i) : \mathbb{R}^n \to \mathbb{R}$. (As development proceeds,
we may want to include both selectability and rejectability utility functions.) The goal is to enter
into market negotiations so that agents are able to increase their utility by exchanging their own
endowment of goods $\mathbf{w}^i$ for another endowment of goods. Let $\mathbf{x}^i = [x^i_1, x^i_2, \ldots, x^i_n]^T$ be the vector
of goods of agent $i$ after a round of market trading.

In this economy, there is also a price vector $\mathbf{p} = [p_1, p_2, \ldots, p_n]^T$, arrived at largely by market
forces, which determines how goods are traded. The amount that agent $i$ stands to receive for his
allotment of goods is

$$m^i = \sum_{j=1}^{n} p_j w^i_j = \mathbf{p}^T \mathbf{w}^i.$$

Thus $m^i$ is the amount of budget available to agent $i$ for dealings in the market. [We may also
find it convenient to endow each agent with another source of budget which is not tied to
any good, i.e., money. This could be used to give them greater flexibility in the market.] To keep
within budget, we must have

$$\mathbf{p}^T \mathbf{x}^i \le \mathbf{p}^T \mathbf{w}^i$$

(money spent does not exceed money available).

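Let $w_j = \sum_{i=1}^{M} w^i_j$ be the total amount of good $j$ available in the market among all agents, and
let $\mathbf{w} = [w_1, w_2, \ldots, w_n]^T$. Then, since an agent cannot purchase more than is available, we must
have

$$0 \le \mathbf{x}^i \le \mathbf{w}.$$

Let $\mathbf{x} = [\mathbf{x}^1; \mathbf{x}^2; \ldots; \mathbf{x}^M] \in \mathbb{R}^{Mn}$ denote the total vector of goods after market. The market
determines the vector $[\mathbf{x}; \mathbf{p}]^T \in \mathbb{R}^{Mn+n}$.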

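Under a state of competitive equilibrium, everyone is satisfied to some degree. The problem of
competitive equilibrium can be stated as:

For $i = 1, 2, \ldots, M$, for a price vector $\mathbf{p}$, find $\mathbf{x}^i$ to satisfy

$$f^i(\mathbf{x}^i) = \max_{\mathbf{x}} f^i(\mathbf{x}) \qquad (4.17)$$

subject to

$$\mathbf{p}^T \mathbf{x}^i \le \mathbf{p}^T \mathbf{w}^i \qquad (4.18)$$

$$0 \le \mathbf{x}^i \le \mathbf{w} \qquad (4.19)$$

and

$$\sum_{i=1}^{M} \mathbf{x}^i \le \mathbf{w} \qquad (4.20)$$

$$\sum_{j=1}^{n} \cdots \qquad (4.21)$$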
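The constraints (4.18) through (4.20) can be checked mechanically for any candidate outcome. The sketch below does only that: it does not perform the maximization in (4.17), and the function names and the two-agent data in the usage lines are assumptions made for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def within_budget(p, x_i, w_i):
    """(4.18): value of the bundle held after trading does not exceed the endowment's value."""
    return dot(p, x_i) <= dot(p, w_i)

def within_bounds(x_i, w_total):
    """(4.19): 0 <= x_i <= w, componentwise."""
    return all(0 <= xj <= wj for xj, wj in zip(x_i, w_total))

def market_clears(xs, w_total):
    """(4.20): holdings summed over agents do not exceed the total available."""
    return all(sum(col) <= wj for col, wj in zip(zip(*xs), w_total))

def feasible(p, xs, ws):
    """Check (4.18)-(4.20) for post-trade bundles xs given endowments ws and prices p."""
    w_total = [sum(col) for col in zip(*ws)]
    return (all(within_budget(p, x, w) for x, w in zip(xs, ws))
            and all(within_bounds(x, w_total) for x in xs)
            and market_clears(xs, w_total))

# Assumed toy data: two agents, two goods, unit prices.
prices = [1, 1]
endowments = [[2, 0], [0, 2]]
even_swap = [[1, 1], [1, 1]]      # each agent trades one unit for one unit
over_budget = [[2, 1], [0, 1]]    # agent 0 would hold more value than it started with
```

With these values `feasible(prices, even_swap, endowments)` holds, while `over_budget` violates (4.18) for the first agent.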


In this economy, agents whose goods are primarily among the demands $\mathcal{D}$ are termed demanders, and those
whose goods are primarily among the supplies $\mathcal{S}$ are called suppliers. Demanders have utility functions that
favor small values of the goods they have in demand. In essence, they sell their demands to
suppliers. Suppliers buy the demands by selling their supplies, and have utility functions which
favor small values of their supplies.

[There seems to be a problem in the pricing structure as stated so far. I haven’t yet tied in the
constitutive relationship, nor excluded the possibility of an agent selling his demands to another
agent for demands. Also, the notion of time to completion is not yet entered in as an explicit part
of the model.]

Example 4.7.1 Woofers ($d_1$) and weefers ($d_2$) are made out of widgets ($s_1$) and wadgets ($s_2$) according to
the formula

$$d_1 = 2s_1 + 3s_2$$

$$d_2 = 4s_1 + s_2$$

Let Adam ($X_1$), Eve ($X_2$), Cain ($X_3$) and Abel ($X_4$) be four agents, where Adam and Eve are demanders,
and Cain and Abel are suppliers. The initial allocation to these agents is

$$\mathbf{w}^1 = (2, 2, 0, 0)$$

(Adam wants 2 woofers and 2 weefers, and starts with no supplies)

$$\mathbf{w}^2 = (4, 5, 0, 0)$$

(Eve wants 4 woofers and 5 weefers, and starts with no supplies)

$$\mathbf{w}^3 = (0, 0, 4, 4)$$

(Cain can supply 4 each of widgets and wadgets) and

$$\mathbf{w}^4 = (0, 0, 3, 3)$$

(Abel can supply 3 each of widgets and wadgets). Clearly not everyone is going to be fully satisfied since
there are insufficient materials.

The utility functions are

$$f^1(\mathbf{x}^1) = -(x^1_1)^2 - (x^1_2)^2$$

(Adam wants to drive down the number of demands he has, by selling them off).

Or perhaps what we want is to incorporate the constitutive relationships right from the beginning: we
want  to  drive  the  demands  down  to  zero,  by  increasing  the  corresponding  supplies.  Maybe  we  should  have 

$$f^1(\mathbf{x}^1) = -(x^1_1)^2 - (x^1_2)^2 + [x^1_1(2x^1_3 + x^1_4)]^2 + [x^1_2(4x^1_3 + 3x^1_4)]^2,$$

and  similarly  for  the  other  functions. 

$$f^2(\mathbf{x}^2) = -10(x^2_1)^2 - 5(x^2_2)^2$$

(Eve wants to drive down the number of demands she has, by selling them off, but appears to be more
demanding; should we call her Lilith?)

$$f^3(\mathbf{x}^3) = -(x^3_3)^2 - (x^3_4)^2$$

(Cain is happy to get rid of his supplies)

$$f^4(\mathbf{x}^4) = -(x^4_3)^2 - (x^4_4)^2$$

(Abel is happy to get rid of his supplies). □
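The claim in the example that materials are insufficient can be checked with a few lines of arithmetic. The sketch below reads the recipe as "one woofer consumes 2 widgets and 3 wadgets, one weefer consumes 4 widgets and 1 wadget"; that reading, and the variable names, are assumptions.

```python
# Supplies consumed per unit of each demand, as (widgets, wadgets).
RECIPE = {"woofer": (2, 3), "weefer": (4, 1)}

demands = {"Adam": {"woofer": 2, "weefer": 2}, "Eve": {"woofer": 4, "weefer": 5}}
supplies = {"Cain": (4, 4), "Abel": (3, 3)}  # (widgets, wadgets)

def required(order):
    """Total (widgets, wadgets) needed to fill one agent's order."""
    widgets = sum(n * RECIPE[item][0] for item, n in order.items())
    wadgets = sum(n * RECIPE[item][1] for item, n in order.items())
    return widgets, wadgets

# Totals over all demanders and over all suppliers.
needed = tuple(map(sum, zip(*(required(o) for o in demands.values()))))
available = tuple(map(sum, zip(*supplies.values())))
```

Under this reading Adam alone needs 12 widgets and 8 wadgets, more than the 7 of each that Cain and Abel hold together.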

4.7.1 Ant pile

An  interesting  and  potentially  rich  problem  for  study:  There  is  a  pile  of  resource  —  “food”  —  which 
is  sought  by  M  teams  of  agents.  We  will  use  the  abbreviation  “ants”  for  these  agents.  The  goal  of 
the  game  is  for  each  team  to  transport  as  much  of  the  food  as  possible  to  their  base.  Members  of 
the  teams  may  be  endowed  with  different  physical  capabilities,  such  as  carriers,  blockers,  guards, 
etc.,  and  with  varying  cognitive  abilities.  This  therefore  may  be  considered  a  generalization  of  the 
capture  the  flag  game  studied  by  Goodrich  [88,  page  72].  Rather  than  having  a  single  flag,  the  food 
resource may be viewed as a large heap of flags. Thus, a single game provides more opportunity
for dynamics to develop and continue than in single-flag play, and the results of the game may
better  reflect  ensemble  properties  of  play.  A  variety  of  variations  on  the  basic  game  are  possible, 
such  as: 

• The location of the food may be initially unknown, requiring some mapping capability. Or
the  location  may  change  from  time  to  time. 

•  The  size  of  the  pile  of  food  may  vary  physically,  becoming  smaller  as  food  is  depleted  from 
it,  and  hence  requiring  more  travel  time. 

•  The  ants  may  have  various  sensory  limitations  placed  upon  them.  For  example,  they  may 
play  at  night,  being  only  able  to  communicate  and  locate  by  touch. 

•  Coalition  play  may  be  possible,  where  the  winning  score  is  to  some  coalition  of  teams. 

•  Time  deadlines  may  be  imposed.  In  this  mode,  even  single-team  play  becomes  interesting, 
as the team must organize to move the maximum amount of food in the given time.

•  Modeling  details  such  as  incorporating  the  cost  of  deliberation/negotiation  may  be  included. 

•  Strength  abilities  of  an  ant  may  also  be  interesting  to  model,  so  that  ants  may  carry  different 
loads. 

The  overall  goal  of  the  problem  is  to  provide  a  framework  in  which  interesting  negotiation  can  take 
place,  then  to  look  for  principles  of  negotiation  dynamics  which  may  have  general  applicability. 

4.7.2  Ant  postures 

We model the behavior dynamic of an ant by assuming that an ant may assume different postures
at different times. These postures include elements of the set

V  =  {foodward,homeward,passer,defender,blocker,attacker}. 

(Other  possibilities  may  also  arise.)  These  describe  the  following  aspects  of  behavior. 

foodward A foodward ant is moving generally toward the food, with possible deviations
to avoid ants either on its own or other teams.

homeward A homeward ant is moving generally toward its home base (with possible deviations
to avoid ants either on its own or other teams).

passer  A  passer  is  an  ant  who  passes  his  food  to  another  ant  (capable  of  accepting  it).  This 
behavior  may  lead  to  daisy-chaining  as  a  means  of  food  transport. 

defender  A  defender  is  an  ant  who  defends  other  ants  on  his  own  team,  making  it  possible  for 
them  to  either  carry  food  or  to  reach  the  food. 

blocker  A  blocker  ant  attempts  to  impede  the  progress  of  an  ant  from  another  team  toward  its 
goal,  either  food  or  home. 

attacker An attacker ant takes a more aggressive role of actually attacking an ant (rather than just
blocking  it). 

Extending  the  role  of  attacker,  it  might  be  interesting  to  incorporate  an  ability  to  steal  food, 
perhaps  if  several  attackers  surround  an  ant  carrying  food. 

Subset  postures  —  assuming  more  than  one  of  these  elemental  roles  —  may  also  be  possible. 
Which posture an ant should assume is itself a collective decision problem.
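One possible way to represent elemental and subset postures in software is a flag enumeration, in which postures combine with a bitwise OR. This is only an illustrative data-structure sketch; the class name `Posture` and the example combination are assumptions.

```python
from enum import Flag, auto

class Posture(Flag):
    """Elemental postures; members can be OR-ed together to form subset postures."""
    FOODWARD = auto()
    HOMEWARD = auto()
    PASSER = auto()
    DEFENDER = auto()
    BLOCKER = auto()
    ATTACKER = auto()

# A subset posture: an ant heading home that is also willing to pass its load.
carrier = Posture.HOMEWARD | Posture.PASSER
```

Membership tests such as `Posture.PASSER in carrier` then tell a decision rule which elemental roles are currently active.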

It may be interesting to model the carrying of the food as a modification of the mass: a
food-carrying agent can’t move as fast.

Another modification might be tiredness: the longer an ant moves carrying a load, the smaller
the range of forces it can apply to movement. (This would tend to motivate the idea of passing
on to another, fresher, ant.)

4.7.3  Some  selectability  and  rejectability  functions 

Selectability  should  describe  an  agent’s  purpose,  or  objective.  Rejectability  should  describe  the 
penalty  or  cost  of  control. 

I  will  now  try  to  establish  some  reasonable  selectability  and  rejectability  functions  for  various 
postures. 

foodward The selectability is simply a function of the distance from the agent to (the estimated
location of) the food.

The rejectability is based on a desire to evade blockers and attackers, to avoid moving into
a location where there is another agent, and to conserve fuel.

homeward The selectability is simply a function of the distance from the agent to “home”.
Alternatively, if a passer with more ability is sufficiently close, a homeward agent may choose to
transfer  its  load  to  the  passer  (this  may  depend  on  other  aspects  of  the  passer,  such  as  how 
free  it  is  from  attackers  and  blockers). 

The rejectability is based on a desire to evade blockers and attackers, avoid moving into a
location where there is another agent, and to conserve fuel.

passer A passer not carrying food has selectability based on getting food from an agent carrying
food. (Of course, there must be some coordination, and willingness on the part of the agent with
food to transfer it to the passer agent.)

The  rejectability  is  based  on  a  desire  to  evade  blockers  and  attackers,  avoid  moving  into  a 
location  where  there  is  another  agent,  and  to  conserve  fuel. 

defender  A  defender  has  as  selectable  choices  those  that  place  it  between  agents  of  the  opposing 
team  that  may  block  or  attack  and  members  of  its  own  team.  More  effectively,  a  defender 
should incorporate dynamic models of members of both teams. Also, consideration should
be given (as the model develops) to obtaining coordinated behavior among blockers and
attackers.

The  rejectability  is  based  upon  a  desire  to  conserve  fuel  and  avoid  squares  where  other  agents 
are. 

blocker A blocker has as selectable choices those that place it in the path of a foodward or
homeward agent, or that can block an opposing blocker from accomplishing its task.

attacker  An  attacker  wants  to  attack  agents  of  the  other  team,  incapacitating  them. 
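The verbal descriptions above can be turned into numbers in many ways. The sketch below shows one hypothetical choice for a foodward ant on a grid: selectability falls off with distance to the estimated food location, and rejectability grows with fuel spent and with moving into an occupied square. The functional forms, the `fuel_cost` value, and the comparison rule (keep an option when its selectability is at least `q` times its rejectability) are assumptions made for illustration, not the report's definitions.

```python
import math

def normalize(weights):
    """Scale non-negative weights so that they sum to 1 (a mass function)."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

def foodward_masses(position, moves, food, others, fuel_cost=0.1):
    """Selectability and rejectability masses over candidate moves (assumed forms)."""
    sel, rej = {}, {}
    for name, (dx, dy) in moves.items():
        nxt = (position[0] + dx, position[1] + dy)
        # Selectability: a move that ends closer to the food is more selectable.
        sel[name] = 1.0 / (1.0 + math.dist(nxt, food))
        # Rejectability: fuel spent, plus a penalty for entering an occupied square.
        occupied = 1.0 if nxt in others else 0.0
        rej[name] = fuel_cost * math.hypot(dx, dy) + occupied + 1e-9
    return normalize(sel), normalize(rej)

def satisficing(sel, rej, q=1.0):
    """Options whose selectability is at least q times their rejectability."""
    return {u for u in sel if sel[u] >= q * rej[u]}

moves = {"stay": (0, 0), "east": (1, 0), "west": (-1, 0)}
sel, rej = foodward_masses((0, 0), moves, food=(3, 0), others={(-1, 0)})
acceptable = satisficing(sel, rej)
```

In this toy setting the westward move, which leads away from the food and into an occupied square, is rejected, while staying put and moving east both remain acceptable.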

4.7.4 Knowledge corpora

We  may  also  want  to  explore  various  endowments  of  knowledge  upon  the  agents,  and  determine 
behaviors as a function of how much they know. Here are some possibilities:

•  Full  knowledge:  every  ant  on  every  team  has  knowledge  of  the  state  of  every  ant  on  every 
other  team. 

•  Full  knowledge  +  anticipation:  every  ant  on  every  team  has  knowledge  of  the  state  of  every 
ant  on  every  other  team,  plus  has  the  ability  to  predict  something  about  the  state  of  ants  over 
some  horizon  into  the  future. 

•  Local  knowledge:  ants  are  aware  only  of  those  other  ants  in  some  neighborhood  around 
them.  Anticipation  may  also  be  incorporated. 

Also,  various  models  of  command  structure  may  be  explored.  For  example,  a  top-down  structure 
in  which  a  single  ant  directs  a  team  might  be  employed,  or  a  fully  distributed  structure.  It  may 
even  be  possible  to  explore  changing  the  structure  based  on  another  decision. 

4.7.5  Some  initial  notation 

We now establish some notation for this game. The $i$th team is denoted $\mathcal{T}_i$,* and the number of
members on $\mathcal{T}_i$ is $n^i$. Ant $j$ on team $\mathcal{T}_i$ is denoted as $\mathcal{T}_i(j)$.

* This differs from the notation in [88], where $\mathcal{K}$ is used to denote a coalition. However, since coalitions may be
built using members of teams, we introduce notation to distinguish teams from coalitions.

Let $\mathbf{x}_{\mathcal{T}_i(j)}(t)$ denote the dynamical state of ant $\mathcal{T}_i(j)$ at time $t$. The dynamics are governed by
the general equation

$$\mathbf{x}_{\mathcal{T}_i(j)}(t+1) = \mathbf{f}_{\mathcal{T}_i(j)}\bigl(\mathbf{x}_{\mathcal{T}_i(j)}(t), \mathbf{u}_{\mathcal{T}_i(j)}(t), t\bigr)$$

where $\mathbf{u}_{\mathcal{T}_i(j)}(t)$ is an input function at time $t$. To avoid the multiple subscripts, we will also employ
a notation such as $\mathbf{x}^i$, where the team and member identification are implicit. Thus $\mathbf{x}^i$ and $\mathbf{x}^j$,
$i \ne j$, refer to the state of two different ants. For notational brevity, we will also let $\mathbf{x}(t)$ denote the
state for a general agent. Assuming that play occurs on a two-dimensional space with coordinates
$(x, y)$, a state vector for Newtonian dynamics is

$$\mathbf{x}(t) = \begin{bmatrix} x(t) \\ y(t) \\ \dot{x}(t) \\ \dot{y}(t) \end{bmatrix} = \begin{bmatrix} \mathbf{p}(t) \\ \mathbf{v}(t) \end{bmatrix}$$

and the state update equation can be written as

$$\mathbf{x}(t+1) = A\mathbf{x}(t) + B\mathbf{u}(t)$$

where $A$ and $B$ are constant matrices determined by $m$ and $T$ and the input $\mathbf{u}(t)$ collects the applied forces,
with $m$ being the mass of the agent, $T$ being the sample time, and $F_x(t)$ and $F_y(t)$ being the forces
applied in the $x$ and $y$ directions at time $t$.
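As an illustration of a linear update of this form, the sketch below uses a standard double-integrator discretization, in which a constant force over one sample period changes position by $T v + (T^2/2m)F$ and velocity by $(T/m)F$. This particular choice of $A$ and $B$ is an assumption made for the example; the function name `step` and the default parameter values are likewise hypothetical.

```python
def step(state, force, m=1.0, T=0.1):
    """One update x(t+1) = A x(t) + B u(t) for an assumed double-integrator model.

    state = (x, y, vx, vy), force = (Fx, Fy).
    """
    x, y, vx, vy = state
    fx, fy = force
    ax, ay = fx / m, fy / m  # accelerations from Newton's second law
    # Position rows: x + T*vx + (T^2 / 2m) * Fx; velocity rows: vx + (T / m) * Fx.
    return (x + T * vx + 0.5 * T * T * ax,
            y + T * vy + 0.5 * T * T * ay,
            vx + T * ax,
            vy + T * ay)

# Usage: an ant of mass 2 moving east at unit speed, pushed east with force 2.
next_state = step((0.0, 0.0, 1.0, 0.0), (2.0, 0.0), m=2.0, T=1.0)
```

Increasing `m` for a food-carrying ant reduces the acceleration produced by the same force, which matches the earlier remark that a loaded ant cannot move as fast.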

Let $\sigma_{\mathcal{T}_i(j)}(t)$ denote the posture of the agent at time $t$. Strictly speaking, this should form part
of the state, so it may be convenient sometimes to form the augmented state

$$\tilde{\mathbf{x}}(t) = \begin{bmatrix} \mathbf{x}(t) \\ \sigma(t) \end{bmatrix}.$$

For purposes of coordination, it is necessary for an ant $i$ to retain an estimate of the state of other
ants. We will let $\hat{\mathbf{x}}^i_j(t)$ denote ant $i$’s estimate of ant $j$’s state (or, for the augmented state, $\hat{\tilde{\mathbf{x}}}^i_j(t)$).


4.8  Qualitative  structural  properties  for  multiagent  systems 

In  this  section,  a  completely  different  perspective  is  sought  on  the  multiagent  problem.  Rather 
than  seeking  a  particular  control  algorithm  or  a  particular  negotiation  strategy,  general  structural 
properties  are  sought  using  the  techniques  of  catastrophe  theory,  or,  under  its  more  modern  but  less 
dramatic name, bifurcation theory. This theory can accommodate general arguments, and leads to
some semi-qualitative results.

4.8.1 Introduction


We  consider  the  qualitative  effectiveness  and  structural  stability  of  a  generic  multiagent  system, 
where the structural stability is considered as the number of agents and their proximity are varied.
The  intent  is  not  to  obtain  particular  numerical  answers,  but  to  explore  structural  (or  topological) 
aspects  of  the  system  as  parameters  vary.  The  tools  used  in  this  exploration  derive  from  those 
of  catastrophe  theory  [89,  90,  91,  92].  Using  catastrophe  theory,  the  phase  transitions  which  are 
known  to  occur  in  problems  of  multiagent  systems  are  accounted  for  by  folds  in  the  catastrophe 
manifold  associated  with  the  system. 


4.8.2  Catastrophe  theory 

Catastrophe theory was a topic of considerable research in the 1970’s, but has since fallen into relative
disuse. However, the mathematical models remain valid, even if some of the applications are suspect
(including, perhaps, even this one). Even in its weakest applications, catastrophe theory can
provide effective metaphors to describe complex behavior, even if it does not provide a justified
explanation [90, p. 128]. Catastrophe theory is well-suited to problems in the softer sciences,
where underlying mechanisms are frequently not understood, or in large problems, such as multiagent
systems, where collective behavior is observed but is difficult to describe using conventional
localized analysis.

Catastrophe theory deals with the structural properties and qualitative nature of smooth functions
parameterized by continuous sets of parameters. For example, $f(x; u, v) = x^3 + xu + v$ is
a set of functions (in the variable $x$) with parameters $u$ and $v$. In catastrophe theory, the structural
stability of such parameterized functions is examined: if the parameters vary slightly, does
the function retain its qualitative form? Since the functions are smooth and the parameters vary
continuously, the answer is usually yes, but there are some values of the parameters at which the
function undergoes a qualitative change near the critical points of the function. For example, the
function $f(x; 0, 0) = x^3$ has only one repeated root (at zero), and no minima or maxima, only a
point of inflection. The function $f(x; -\epsilon^2, 0) = x^3 - \epsilon^2 x$ has roots at $x = 0$ and $x = \pm\epsilon$ for the
parameter values $u = -\epsilon^2$ and $v = 0$; there is now a maximum value for $x < 0$ and a minimum
value for $x > 0$. Thus the structural nature of the function near $x = 0$ is changed by the change
from $u = 0$ to $u < 0$. This change in behavior is what is called a catastrophe; the word in
this context refers to a sudden change in qualitative nature, as opposed to a disastrous change.
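The qualitative change in this example can be checked numerically. The sketch below, with hypothetical helper names, evaluates the cubic family and finds its critical points from $f'(x) = 3x^2 + u = 0$: there are none for $u > 0$, a single degenerate one at $u = 0$, and a maximum and minimum pair for $u < 0$.

```python
import math

def f(x, u, v):
    """The cubic family f(x; u, v) = x^3 + u*x + v."""
    return x**3 + u * x + v

def critical_points(u):
    """Real solutions of f'(x) = 3x^2 + u = 0, in increasing order."""
    if u > 0:
        return []
    r = math.sqrt(-u / 3.0)
    return [0.0] if r == 0 else [-r, r]

def second_derivative(x):
    """f''(x) = 6x; negative at a local maximum, positive at a local minimum."""
    return 6.0 * x

eps = 0.5
left, right = critical_points(-eps**2)
```

For `eps = 0.5` the critical point `left` is a local maximum at negative `x` and `right` is a local minimum at positive `x`, while `f` vanishes at `0` and at `±eps`.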

Since catastrophe theory examines qualitative changes in the functions, as opposed to particular
numerical values, another aspect of the theory is that it only provides distinctions up to smooth
changes of variables (diffeomorphisms). So, for a smooth change of variables $x = \phi(y)$, the
function $f(\phi(y); u, v)$ may be examined for the same qualitative behavior near its critical points
as for $f(x; u, v)$ (provided that $\phi$ is free from critical points in the appropriate neighborhoods).
Similar changes of variables are allowed for the parameters. Frequently, a change of variables is
employed to move the critical points of a given function to some convenient location (such as the
origin). Thus, catastrophe theory provides for classification of the critical points of functions up to
diffeomorphisms.

One of the key results of catastrophe theory is that all systems which exhibit structural changes
and have up to four parameters can be classified (up to diffeomorphism) into one of seven canonical
forms. Such systems will exhibit behavior indicative of the structural change, such as jumping,
hysteresis, sensitivity to parameter changes, and unstable regions.

Catastrophe  theory  has  been  used  to  revisit  a  variety  of  problems  in  the  physical,  biological,  and 
social  sciences.  One  example  which  seems  particularly  germane  is  the  use  of  catastrophe  theory 
to  account  for  phase  transitions  in  thermodynamic  systems.  This  provides  a  useful  metaphor,  as 
phase  transitions  in  multiagent  systems  are  also  observed. 


4.8.3  A  Catastrophic  Work  model 


We  examine  a  model  for  the  amount  of  work  accomplished  by  a  set  of  N  agents  working  together. 
In  this  model,  two  aspects  of  the  interagent  work  are  presented.  First  is  the  conventional  division 
of  labor  —  the  work  is  split  N  different  ways  (except  for  some  “administrative”  overhead),  where 
individual differences in agent abilities are not accounted for. The time to complete the work is
thus roughly proportional to $1/N$. There is also a term to account for the multiple interactions: $N$
agents may interact with each other in $O(N^2)$ ways, and the interaction occurs in a way to decrease
the  time  to  completion.  While  this  particular  model  can  (and  should)  be  dissected  and  discarded  as 
unrealistic  and  too  simplistic,  we  argue  that  models  which  similarly  exhibit  quadratic  dependence 
on  the  number  of  agents  (due  to  agent  interaction)  combined  with  some  linear  dependence  on 
N  in  such  a  way  that  a  cubic  term  arises,  under  diffeomorphic  transformations  will  likely  form 
the  canonical  cusp  which  appears  under  this  model.  If  more  complicated  nonlinear  models  are 
employed,  they  are  subject  also  to  catastrophe  theory:  if  they  don’t  form  this  canonical  cusp, 
another  canonical  catastrophe  likely  will  be  evident. 

Let $A$ represent an amount of work to be accomplished (represented in units of man-hours)
by $N$ agents. In a typical situation involving multiple workers, there is some worker overhead
associated with the workers. We represent this overhead by $b$. There is also some additional time
overhead associated with each problem, which we will denote as $C$. Thus, if all that accrues from
the presence of multiple workers is a division of labor (with some administrative overhead) we can
model the time to accomplish work $A$ as

$$T = C + \frac{A}{N - b}.$$

However, in a multiagent scenario, there is usually some additional benefit from being able to work
cooperatively. We will model the additional benefit as being quadratic in the number of workers
$N$. Thus there is a decrease in time due to multiple workers of $EN^2$.* Under this assumption, the
time to complete the task is thus

$$T = C + \frac{A}{N - b} - EN^2. \qquad (4.22)$$

* A better model would exhibit quadratic behavior initially, but with some saturation effect as the number of workers
increases, as in $E \tanh(fN^2)$, for appropriate model constants $E$ and $f$. However, results in catastrophe theory depend
only on the lower-order terms in the Taylor series, so the model still works up to diffeomorphism, provided, of course,
that there is some limit to the values of $N$ employed, since negative time to complete a task is meaningless.

This heuristic model is, of course, valid only over values of $N$ that lead to a positive time to
complete the task. In the analysis that follows, we neglect the discreteness of the number of agents
(or argue that only portions of agents may be employed) so that $T$ varies smoothly with $N$. For
simplicity we set $C = 0$, which mathematically corresponds to a simple change of variables, but
in the model may lead to negative completion times.

The model (4.22) is in many ways similar to the van der Waals gas equation from
thermodynamics [93], [89, p. 327]. In this analogy, our time $T$ is analogous to thermodynamic
pressure  P,  the  amount  of  work  A  is  analogous  to  temperature,  and  the  number  of  workers  is 
analogous  to  the  volume  V.  The  catastrophe  theoretic  interpretation  of  the  van  der  Waals  equation 
has  been  used  to  account  for  phase  transitions  between  solid,  liquid,  gas,  and  fluid  states  of  matter. 
Due  to  its  similarity,  we  expect  (4.22)  to  exhibit  similar  phase  transition  behavior. 

The critical points of (4.22) occur where dT/dN = 0 and d^2T/dN^2 = 0, which is when

\[ N = \frac{b}{3}, \qquad A = -\frac{8 b^3 E}{27}, \qquad T = \frac{b^2 E}{3}. \]

We designate these values as N_c, A_c and T_c. Letting

\[ T' = T/T_c, \qquad N' = N/N_c, \qquad A' = A/A_c \]

and C = 0, we change coordinates so that the critical point is at (1, 1, 1). By this transformation,
(4.22) becomes


\[ T' = -\frac{8A'}{3N' - 9} - \frac{1}{3} (N')^2. \]
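The critical values can be checked numerically. The following is a minimal sketch, assuming the reading N_c = b/3, A_c = -8b^3E/27, T_c = b^2E/3 with C = 0 (the sign of A_c is the one that puts the normalized critical point at (1, 1, 1)). The function names and the constants b = 3, E = 1 are illustrative only and do not come from the report.

```python
def completion_time(n, a, b, e, c=0.0):
    """Model (4.22): T = C + A/(N - b) - E*N**2."""
    return c + a / (n - b) - e * n**2


def d_time(n, a, b, e):
    """First derivative dT/dN of (4.22)."""
    return -a / (n - b) ** 2 - 2.0 * e * n


def d2_time(n, a, b, e):
    """Second derivative d^2T/dN^2 of (4.22)."""
    return 2.0 * a / (n - b) ** 3 - 2.0 * e


def critical_point(b, e):
    """Critical values (N_c, A_c, T_c) of (4.22) for C = 0."""
    return b / 3.0, -8.0 * b**3 * e / 27.0, b**2 * e / 3.0


def normalized_time(n_prime, a_prime):
    """Normalized model: T' = -8A'/(3N' - 9) - (N')**2 / 3."""
    return -8.0 * a_prime / (3.0 * n_prime - 9.0) - n_prime**2 / 3.0


if __name__ == "__main__":
    b, e = 3.0, 1.0  # illustrative constants (assumption)
    n_c, a_c, t_c = critical_point(b, e)
    print("critical point:", n_c, a_c, t_c)
    print("dT/dN at critical point:", d_time(n_c, a_c, b, e))
    print("d2T/dN2 at critical point:", d2_time(n_c, a_c, b, e))
    print("normalized T' at (1, 1):", normalized_time(1.0, 1.0))
```

Both derivatives vanish at the critical point, and dividing (4.22) by T_c reproduces the normalized form for any N and A, which is a quick consistency check on the change of coordinates.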


The coordinate system is now shifted so that the critical point is at (0, 0, 0) by letting t = T' - 1,
n = N' - 1, and a = A' - 1. This gives rise to the equation

\[ n^3 + u n + v = 0, \tag{4.23} \]

where

\[ u = 3t, \qquad v = 8a - 6t. \]
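The reduction to the cusp form can be verified by direct substitution. The sketch below assumes the normalized model T' = -8A'/(3N' - 9) - (N')^2/3, multiplies through by 3N' - 9, and substitutes T' = 1 + t, N' = 1 + n, A' = 1 + a:

```latex
\begin{align*}
T'\,(3N' - 9) &= -8A' - (N')^2 (N' - 3) \\
(1 + t)(3n - 6) &= -8(1 + a) - (1 + n)^2 (n - 2) \\
3n - 6 + 3tn - 6t &= -8 - 8a - \left(n^3 - 3n - 2\right) \\
n^3 + 3t\,n + (8a - 6t) &= 0
\end{align*}
```

The last line is the cusp equation with u = 3t and v = 8a - 6t. Note that no Taylor truncation is needed here: the substitution yields the cubic exactly.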

The solutions to (4.23) for values of u and v, which are diffeomorphically related to T and A,
over some continuous range define a surface, known as the catastrophe manifold, sketched
in figure 4.5. Equation (4.23) is the classic equation of the cusp catastrophe (see, e.g., [90, p. 42]
or [89, p. 78ff or p. 174ff]). The lines marked B in the parameter space denote the bifurcation
set, at which the catastrophe manifold jumps from one sheet to another. Interior to those lines,
the manifold exhibits three values, one of which is “unattainable” (or unstable). Outside of the
bifurcation set, there is only a single sheet in the manifold.
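The sheet structure can be explored numerically. The following sketch solves the cubic n^3 + un + v = 0 for its real roots, using the standard fact that the sign of 4u^3 + 27v^2 separates the one-root region from the three-root region (so the bifurcation set is 4u^3 + 27v^2 = 0). The function names are illustrative, not part of the report.

```python
import math


def _cbrt(x):
    """Real cube root that also handles negative arguments."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def cusp_roots(u, v):
    """Real solutions n of n**3 + u*n + v = 0, in ascending order."""
    disc = 4.0 * u**3 + 27.0 * v**2
    if disc > 0.0:
        # Outside the bifurcation set: a single sheet (Cardano's formula).
        s = math.sqrt(v * v / 4.0 + u**3 / 27.0)
        return [_cbrt(-v / 2.0 + s) + _cbrt(-v / 2.0 - s)]
    if u == 0.0:
        # disc <= 0 with u == 0 forces v == 0: the cusp point itself.
        return [0.0]
    # On or inside the bifurcation set (u < 0): trigonometric solution.
    r = 2.0 * math.sqrt(-u / 3.0)
    arg = (3.0 * v / (2.0 * u)) * math.sqrt(-3.0 / u)
    theta = math.acos(max(-1.0, min(1.0, arg)))
    return sorted(r * math.cos((theta - 2.0 * math.pi * k) / 3.0) for k in range(3))


if __name__ == "__main__":
    u = -3.0  # for this u the bifurcation set is crossed at v = -2 and v = +2
    for v in (-2.5, -1.0, 0.0, 1.0, 2.5):
        roots = cusp_roots(u, v)
        print(f"v = {v:5.2f}: {len(roots)} sheet(s), n = {[round(x, 3) for x in roots]}")
```

Sweeping v at fixed negative u shows three roots between the two fold lines and one root outside them, which is the geometry behind the jumps and hysteresis discussed in the text.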

This cusp exhibits the classic attributes of catastrophe, some of which we sketch here [90, p. 12]:

Sudden jumps As the parameter v is varied across its parameter space (such as on the line l_1
shown), the path p_1 on the cusp must jump from one sheet of the catastrophe manifold to
another sheet.

Hysteresis A path in one direction across the (u, v) parameter space, followed by a return path,
does not necessarily result in the same path across the folded sheet. (Think of traversing the
line l_1, then traversing again in the reverse direction.)

Divergence  Divergence  is  sensitivity  to  initial  conditions.  This  is  exhibited  for  the  path  starting 
from  P  and  moving  along  similar  paths  in  parameter  space,  but  arriving  on  different  sheets 
of  the  fold.  Nearby  trajectories  in  parameter  space  can  have  significantly  different  behavior. 

The emergence of the cusp catastrophe indicates that in this multiagent model, there will be
phase transitions as the parameters (amount of work or time to complete) vary, corresponding to
the folds of the manifold.


4.8.4  A  structural  stability  look  at  multiagent  control 

There  is  another  rather  different  model  that  can  be  employed  to  describe  aspects  of  multiagent 
control.  We  consider  the  “effectiveness”  of  a  multiagent  system  over  a  distributed  domain.  (For 
a closely related model in ecology, see [89, ch. 16].) The analysis is based on some heuristic
concepts  of  multiagent  behavior: 


•  There  is  an  advantage  to  multiagent  cooperation,  with  per-agent  effectiveness  increasing 
as  the  number  of  agents  increases.  Some  tasks  simply  require  more  agents  to  carry  them 
through.  Some  tasks  are  intrinsically  distributed.  Others  may  benefit  from  a  division  of 
labor  [94,  p.  4]. 

•  The  benefits  from  coordination  are  limited  however.  Eventually  saturation  effects  come  into 
play,  and  a  point  of  diminishing  returns  is  reached.  For  example,  benefits  accruing  from 
division  of  labor  reach  a  saturation  point  when  tasks  can  not  be  further  subdivided.  On  the 
basis  of  this  observation,  we  postulate  that 


Letting  S(N)  denote  the  agent  “effectiveness”  per  agent  as  a  function  of  the  number  of  agents  N, 
we  have  in  figure  4.6  a  plot  of  S(N)  and  dS/dN  that  represent  these  two  heuristic  concepts  of 
multiagent  behavior.  The  second  concept  can  be  stated  as 


lim

While there are advantages in a multiagent system, there are also costs associated with
cooperation:


•  We  model  the  cost  as  being  related  to  the  distance  between  agents.  Thus,  it  might  represent 
costs  associated  with  transportation  or  communication. 

We model the agents as being distributed over some problem domain of “size” R. The problem
domain may be a geographical one, where R is a measure of the size of the region. Or, the problem
domain may be a cognitive one, where R is a measure of the number of subtasks to be performed
in the completion of some assigned work. We denote the “density” of the agents in the domain as
D. For a geographical domain, D represents the actual average density, as agents per unit area. For
a cognitive domain, the density might be a measure of the skill level of the agents (for example,
how many subtasks they are equipped to deal with). The effective number of agents N in the
system with size R and density D is an increasing function of the domain size and the density. For
example, for a geographical domain, we might have N = πR^2 D.

Figure 4.6: Effectiveness per agent as a function of number of agents, and its derivative

To  model  the  cost  of  agent  interaction,  let 


\[ E(N) = C - d \frac{N}{D} \]

denote  how  the  agents’  effectiveness  decreases  as  the  number  of  agents  increases,  where  C  and  d 
are  constants. 

The total effectiveness combines the effectiveness due to agent interaction and the cost of agent
interaction:

\[ F(N) = E(N) + S(N) = C - \frac{dN}{D} + S(N). \]

Then

\[ \frac{dF}{dN} = \frac{dS}{dN} - \frac{d}{D}. \]

Various qualitative forms of dF/dN are obtained as d/D varies. These are shown on the left of
figure 4.7. On the basis of the shape of dS/dN and depending on the size of d/D, the function
dF/dN may have one zero, multiple zeros, or no zeros. The corresponding total effectiveness is
shown plotted next to its derivative (on the right of figure 4.7).

If the density of agents is sufficiently high, and the task distribution is such that each agent
is kept productively occupied, then the effectiveness is as shown in figure 4.7(b): there is a
unique value of N maximizing the effectiveness. Beyond that point, diminishing returns reduce
effectiveness. As the density of productively occupied agents decreases, as in figure 4.7(e) and
(g), the position of greatest effectiveness first decreases; eventually it becomes more effective to
work in isolation, as in figure 4.7(h). These function values can be viewed as slices from some
catastrophe manifold.


Figure 4.7: The derivative of the effectiveness, and the effectiveness, as a function of d/D


Figure 4.8 illustrates the plot of the maxima of F(N) as a function of the parameter d/D.
As d/D increases, the maximizing value of N decreases smoothly. However, a point is reached (as
in figure 4.7(f)) where the maximizing value drops suddenly to the lowest value of N. There is thus
a “jump”, that is, a phase transition, in the population of agents that can be supported, and a
smaller number of agents is more acceptable. For larger values of d/D, only the minimum number of
agents is acceptable.
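The jump in the maximizing N can be reproduced with a toy computation. The sketch below is only an illustration under stated assumptions: the report does not give a formula for S(N), so a hypothetical sigmoidal choice S(N) = N^2/(1 + N^2) is used, N is treated as continuous on [0, 10], and k stands for the ratio d/D. For this particular S the maximizer jumps to N = 0 once k exceeds 1/2.

```python
def saturation(n):
    """Hypothetical benefit S(N): sigmoidal in N, saturating at 1 (assumption)."""
    return n * n / (1.0 + n * n)


def total_effectiveness(n, k, c=0.0):
    """F(N) = C - (d/D) * N + S(N), where k stands for d/D."""
    return c - k * n + saturation(n)


def best_n(k, n_max=10.0, steps=10000):
    """Grid-search the N in [0, n_max] that maximizes F(N)."""
    grid = [n_max * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda n: total_effectiveness(n, k))


if __name__ == "__main__":
    for k in (0.1, 0.2, 0.3, 0.4, 0.45, 0.55, 0.6):
        print(f"d/D = {k:4.2f}  most effective N = {best_n(k):.3f}")
```

As k grows the best N shrinks smoothly, then drops abruptly to the smallest admissible value, which mirrors the qualitative behavior described for figure 4.8.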


Figure 4.8: Plot of most efficient N as a function of the parameter d/D. A phase transition of sorts
is observed.


One point that may be made is that if the agents work closely together, where the “density” is
high, then it is more efficient to have more agents. If the problem domain is large enough that the
agent density is low, more efficiency is gained by independent operation. (This has interesting
implications as applied to intellectual endeavors. People with close interests may work synergistically
on a problem, while people with only related interests may get in each other’s way.)

4.8.5  Discussion 

It must be conceded that the models described above do not, in fact, represent any real system of
agents. However, the assumptions in the models are based on practical considerations of the
qualitative way that multiagent systems can operate. Up to diffeomorphism, these are feasible models
for a system. Furthermore, it is known that multiagent systems can experience a “phase transition.”
This is typically seen at some sort of boundary between “easy” systems, typically where the
number of tasks is much less than the number of agents, and “hard” systems, where the
resources of the system are stressed by the demands placed upon it. An ongoing research question is
to explore the nature of the phase transition. The simple model presented here sheds light on the
nature of the phase transition.

Appendix A

TaskSim  Users  Manual 


A.1 Introduction

The TaskSim simulation was developed by the USU Autonomous Negotiating Teams (ANTs)
research group at Utah State University, as part of the USU Analytic Prediction of Emergent
Dynamics (APED) project. The simulation will model a limited number of resources as they are assigned
to travel to and do work on different jobs. We hope to employ rate equations and praxeid theory
along with the simulation to develop an understanding of emergent dynamics in ANT systems.
This manual can be found online at http://ssl.usu.edu/paul/tasksim/manual/.


A.2  The  TaskSim  scenario 

In  the  TaskSim  world,  there  are  jobs  (or  tasks)  that  need  to  be  done,  and  resources  that  can  do 
them. There is also a scheduler, or perhaps a series of schedulers, that control one, some, or all
of  the  resources.  There  is  never  a  way  for  the  schedulers  to  know  exactly  when  or  where  new 
jobs  will  pop  up,  but  when  they  do,  the  schedulers’  responsibility  is  to  assign  resources  to  do  the 
jobs.  Jobs  have  certain  properties:  they  have  some  amount  of  work  that  needs  to  be  done,  and  they 
have  a  deadline  before  which  the  work  should  be  completed.  If  a  job  isn’t  done  on  or  before  its 
deadline,  it  ceases  to  exist,  and  it  is  counted  as  a  failure.  When  a  job  is  completed  on  or  before  its 
deadline,  it  ceases  to  exist,  and  it  is  counted  as  a  success.  The  number  of  successes  and  failures  are 
counted  during  a  simulation.  Resources  might  do  work  on  jobs  at  different  rates.  The  rate  at  which 
a  resource  can  work  on  a  certain  job  is  termed  the  resource’s  proficiency  at  that  job.  There  may 
be  different  types  of  jobs,  in  which  case  resources  may  have  different  proficiencies  for  each  type. 
Resources  may  gain  or  lose  proficiency.  Jobs  may  have  dependencies  on  other  jobs,  meaning  that 
the  other  jobs  must  be  completed  before  any  work  can  be  done  on  the  job  in  question.  If  a  job  fails, 
all  jobs  which  are  dependent  on  that  job  fail  as  well.  Schedulers  may  or  may  not  take  dependencies 
into  account. 
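The dependency rules above can be sketched in a few lines. This is a hypothetical illustration, not TaskSim's own code: the `Job` class, its field names, and the status strings are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    work: float          # amount of work still to be done
    deadline: float      # time by which the work should be completed
    depends_on: list = field(default_factory=list)
    status: str = "pending"  # "pending", "success", or "failure"


def can_start(job, jobs):
    """Work may begin only after every prerequisite job has succeeded."""
    return all(jobs[name].status == "success" for name in job.depends_on)


def fail(name, jobs):
    """Mark a job as failed and cascade the failure to all dependent jobs."""
    stack = [name]
    while stack:
        current = jobs[stack.pop()]
        if current.status != "pending":
            continue
        current.status = "failure"
        stack.extend(j.name for j in jobs.values() if current.name in j.depends_on)


if __name__ == "__main__":
    jobs = {
        "a": Job("a", 100.0, 50.0),
        "b": Job("b", 80.0, 90.0, depends_on=["a"]),
        "c": Job("c", 60.0, 120.0, depends_on=["b"]),
        "d": Job("d", 40.0, 70.0),
    }
    fail("a", jobs)
    print({name: job.status for name, job in jobs.items()})
```

Failing job "a" takes "b" and "c" with it while the independent job "d" is unaffected, which is why failures tend to arrive in bunches when dependencies are enabled.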
A.3 The Simulation


The  TaskSim  simulation  version  2.x  models  a  two-dimensional  world.  Jobs  and  resources  both 
have  locations  in  that  world,  and  resources  can  only  do  work  on  a  job  if  they  are  at  the  same 
location.  Resources  need  to  move  to  get  to  jobs,  and  they  do  so  at  a  constant  speed.  Many  resources 
may  occupy  the  same  location.  The  simulation  can  use  any  of  several  allocation  methods.  Each 
method  is  built  into  a  plugin,  or  module  (a  file  with  a  .so  extension)  so  that  users  may  create 
their  own  modules.  Doing  so  is  beyond  the  scope  of  this  document.  When  the  simulation  is  run, 
the  user  may  select  from  the  available  modules.  There  are  three  types  of  jobs  in  the  simulation, 
called  “Green”,  “Red”,  and  “Blue”  jobs.  Resources  start  out  with  the  same  proficiency  in  all  job 
types,  but  if  the  “Proficiency  Gain”  option  is  enabled,  they  may  become  specialized  in  one  or  more 
types.  See  Simulation  options:  Resource  proficiency  gain  for  more  info.  Jobs  are  added  to  the 
simulation  either  by  the  user  (only  in  the  interactive  simulation)  or  at  “random”.  When  added  at 
random,  the  job’s  work  amount  and  deadline  are  chosen  from  configurable  uniform  distributions. 
Job  locations  are  chosen  uniformly  on  the  world  grid.  Another  quantity,  the  amount  of  time  before 
the next random job appears, is chosen from a third configurable distribution. See Simulation
options:  Random  distribution  settings  for  more  info.  A  simulation  begins  with  a  certain  number  of 
resources,  and  that  number  can  not  change  for  the  life  of  the  simulation.  The  number  of  resources 
with  which  the  simulation  starts  is  configurable. 

Running  the  simulation  The  TaskSim  simulation  version  2.x  may  be  run  in  several  ways.  When 
invoked  as  “armybase”,  an  interactive  window  is  brought  up  that  allows  the  user  to  watch  the 
resources  moving,  watch  the  jobs  being  completed,  add  jobs  by  clicking  on  the  map,  pause  and 
restart  the  simulation,  and  so  on.  This  allows  the  user  to  learn  what  the  rules  of  the  simulation  are, 
and  see  what  is  happening.  It  is  more  useful  mathematically  to  run  a  series  of  simulations  with  the 
same  initial  parameters  to  see  how  an  allocation  method  fares  on  the  average.  This  is  called  a  “batch 
simulation”  since  several  are  run  at  once.  Batch  simulations  can  be  run  from  the  command  line 
(without  even  any  X  display)  or  using  the  “batch  control  panel”  which  lets  the  user  set  and  change 
options  quite  easily.  Both  of  these  types  of  batch  simulations  are  run  by  invoking  “batchsim”.  The 
output  from  a  batch  simulation  is  simply  the  total  number  of  successes  and  failures,  along  with  a 
success  percentage  (the  number  of  successes  divided  by  the  total  number  of  jobs). 

A.3.1  Simulation  options 

There  are  a  number  of  options  of  which  a  user  should  be  aware.  Each  can  affect  the  outcome  of 
a  simulation.  The  settings  are  changed  by  the  user  in  different  ways  depending  on  the  style  of 
simulation  being  run. 

A.3.2  Random  distribution  settings 

As stated in the section entitled The simulation, jobs can be added to the simulation at “random”.
This is always the case when running a batch simulation, since there is no way to add them
interactively. When jobs are added in this way, there are several quantities which are selected from
uniform distributions. Three of those distributions are configurable.

First:  Deadline.  This  refers  to  the  amount  of  time  given  to  a  job  to  be  finished.  If  the  job  is  not 
finished  before  the  deadline,  it  is  counted  as  a  failure.  The  default  deadline  range  is  40-100. 

Next,  Work  To  Do.  This  is  the  amount  of  work  that  must  be  performed  on  a  job  before  it  is 
completed.  The  default  range  is  70-600. 

Finally,  Next  Job.  This  is  the  amount  of  time  that  will  elapse  before  the  next  random  job  is 
added.  This  information  is  naturally  not  made  available  to  the  schedulers.  Lowering  the  values 
means  more  jobs  will  be  competing  for  resources,  and  raising  them  will  ease  the  workload.  The 
default  range  is  1-10. 
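The three configurable distributions can be illustrated with a short sketch. It uses the default ranges quoted above; everything else is an assumption made for the example, including the function name, the dictionary keys, the world size of 100 map units, and the choice to draw real-valued (rather than integer) samples.

```python
import random

# Default ranges quoted in this manual: (low, high) for each uniform distribution.
DEFAULT_RANGES = {
    "deadline": (40, 100),
    "work": (70, 600),
    "next_job": (1, 10),
}


def random_jobs(seed, count, world_size=100.0, ranges=DEFAULT_RANGES):
    """Generate `count` random jobs; the same seed always gives the same jobs."""
    rng = random.Random(seed)
    clock = 0.0
    jobs = []
    for _ in range(count):
        clock += rng.uniform(*ranges["next_job"])  # time until this job appears
        jobs.append({
            "appears_at": clock,
            "time_allowed": rng.uniform(*ranges["deadline"]),
            "work": rng.uniform(*ranges["work"]),
            # Hypothetical square world; locations drawn uniformly.
            "location": (rng.uniform(0.0, world_size), rng.uniform(0.0, world_size)),
        })
    return jobs


if __name__ == "__main__":
    for job in random_jobs(seed=42, count=3):
        print(job)
```

Narrowing the `next_job` range toward small values makes jobs arrive faster and compete harder for resources, as the text notes.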

A.3.3  Allocation  modules 

The method that a scheduler will use to assign resources to jobs is defined by the allocation
module. After compilation, these modules will appear as files with a .so extension. The user can select
from any of the available modules. Those included with this package are:

•  gensched.so (Generational Scheduling): This method was written specifically for this
simulation to be very efficient. Assignment takes into account job and resource locations, job
dependencies, and resource proficiencies. Resources give “bids” reflecting how long it would
take them to do work on a certain job, including travel time, and the best bids are taken.
When there are not enough resources available to accomplish a job, negotiation can occur,
and resources can be rescheduled to (or stolen by) the needy job.

•  democratic.so  (Democratic  Allocation):  When  a  new  job  is  added,  all  the  resources  are  split 
up  so  that  each  active  job  gets  an  equal  number  of  resources.  Location  is  not  taken  into 
account. 

•  crisis.so  (Crisis  Allocation):  When  a  new  job  is  added,  all  the  resources  are  redistributed  so 
that  jobs  get  resources  in  inverse  proportion  to  their  time  remaining.  Location  is  not  taken 
into  account. 

•  other.so  (Nothing):  No  resources  are  assigned  to  jobs. 
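The two simpler allocation rules can be sketched as follows. This is not the code of democratic.so or crisis.so; it is a hedged illustration of the rules as described, and the function names, the tie-breaking order, and the largest-remainder rounding used to keep resource counts whole are all assumptions.

```python
def democratic_split(n_resources, n_jobs):
    """Give each active job an equal share; early jobs absorb any remainder."""
    base, extra = divmod(n_resources, n_jobs)
    return [base + (1 if i < extra else 0) for i in range(n_jobs)]


def crisis_split(n_resources, times_remaining):
    """Share resources in inverse proportion to each job's time remaining."""
    weights = [1.0 / t for t in times_remaining]  # times must be positive
    total = sum(weights)
    shares = [n_resources * w / total for w in weights]
    counts = [int(s) for s in shares]
    leftover = n_resources - sum(counts)
    # Hand out the remaining resources to the largest fractional parts.
    order = sorted(range(len(shares)), key=lambda i: shares[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    print(democratic_split(10, 3))
    print(crisis_split(12, [1.0, 2.0, 4.0]))
```

In the crisis rule the job closest to its deadline receives the largest share, whereas the democratic rule ignores deadlines entirely; neither takes location into account.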

This list does not include several of the allocation types seen in the earlier (Matlab) versions of
TaskSim, known as Screaming Generals versions 0.x and 1.x. They may be added in the future.

A.3.4 Resource proficiency gain

If  this  option  is  enabled,  resources  will  gain  proficiency  when  working  on  a  job.  They  will  only 
gain  proficiency  in  that  job  type  during  that  time.  In  essence,  they  “learn”  how  to  do  that  type 
of  job  in  a  faster  manner.  The  amount  of  proficiency  gain  is  not  configurable.  In  the  interactive 
simulation,  resource  proficiency  is  indicated  by  a  tint  of  the  appropriate  color.  A  resource  that  is 
excellent at Green jobs will appear bright green.

A.3.5 Resource speed

This is the speed a resource will move while traveling, in map units per time unit. This is currently
only configurable for batch simulations.


A.3.6  Number  of  resources 

This is the number of resources that will exist in the simulation world. There is no way to change
it while a simulation is running. There is no configuration widget to change this value in the
interactive simulation, but the number can be changed by editing Scenario files.


A.3.7  Random  mode:  add  random  dependencies 

If  this  option  is  enabled,  then  when  jobs  are  being  added  at  random,  they  may  also  depend  on  other 
existing  jobs.  (A  job  that  depends  on  other  jobs  must  wait  for  them  to  be  completed  before  any 
work  can  be  done  on  the  job  in  question).  For  each  preexisting  job,  there  is  a  constant  chance  that 
a  new  job  will  depend  on  it. 


A.3.8  Random  mode:  homogenous  jobs 

If  this  option  is  enabled,  all  jobs  added  at  random  into  the  simulation  will  be  Green  jobs.  This 
removes  any  consideration  of  different  job  types  from  the  simulation. 


A.4  armybase  -  the  interactive  simulation 

The interactive TaskSim simulation is invoked as “armybase”. Armybase requires the X Windowing
System to run. The simulation will immediately begin after the window pops up. The resources
won’t do anything, because there are no jobs yet. The Time counter will be moving, though. To
stop the simulation at any time, to carefully examine the progress of jobs or make a graph, click
on the Pause button. The button will appear depressed while the simulation is paused. To unpause,
click the Pause button again.


A.4.1  The  display 

This figure shows the various parts of the interactive user interface.

•  The map window is the main part of the interface. It shows what is happening in the
simulation world.

•  resources are shown as circles on the map window. They gain a tint of red, green, or blue
if they gain proficiency in that type of job. The purple resources, for example, are good at
doing Red and Blue jobs.

•  jobs are shown as radio towers under construction. There is a progress bar under each job,
whose length depends on the amount of work a job needs to do. The amount of the progress
bar covered in green, red, or blue shows how much of the work has already been done. There
is also a number in parentheses under each job. This represents the number of resources that
are committed to working on that job now or sometime in the future. The color of the number
and of the progress bar shows the type of the job.

•  A  dependency  is  shown  as  a  blue  arrow  pointing  from  one  job  to  another.  The  job  at  the 
arrow’s  point  must  wait  for  the  other  job  to  finish  before  work  may  begin  on  it. 

•  menus  and  controls  -  These  allow  the  user  to  control  what  the  simulation  is  doing,  add  jobs, 
restart,  and  so  on. 

•  The meters show more information about the successes and failures of jobs. The green
(middle) meter shows the number of successfully finishing jobs each time unit. The red
(bottom) meter shows the number of failures each time unit. Failures are typically bunched
together, especially with dependencies on. The black (top) meter only shows data for some
allocation methods. It shows the number of negotiations that took place between jobs during
each time unit.

•  The status bar shows the number of jobs that have taken place since time 0 (Events past),
the number of jobs scheduled to appear in the future (this is possible with Scenario files),
the number of jobs that have succeeded and failed since time 0, and the “Mode”. The mode
shows whether jobs are currently being added at random.

A.4.2  Adding  jobs 

To  manually  add  a  job,  click  on  one  of  the  Tower  buttons  in  the  controls.  Then  click  somewhere 
on  the  map  window.  A  job  will  appear,  and  resources  will  (maybe)  rush  to  complete  it.  Jobs  may 
also  be  removed  manually,  although  the  scheduler  won’t  reschedule  anything  after  they’re  gone. 
To  do  this,  right-click  on  a  job  and  select  “Remove”. 

A.4.3  Adding  jobs  with  dependencies 

To  add  a  job  that  depends  on  other  jobs,  you  must  first  select  the  jobs  on  which  the  new  job  will 
depend.  To  select  a  job,  click  the  middle  mouse  button  on  it.  If  you  don’t  have  a  middle  mouse 
button,  try  clicking  both  buttons  at  the  same  time.  Your  X  server  may  be  configured  to  recognize 
that  combination  as  a  middle  click.  Selected  jobs  will  show  a  red  box  around  them.  Once  you  have 
all  the  desired  jobs  selected,  use  the  left  button  to  add  a  job.  It  will  depend  on  the  selected  jobs. 

A.4.4  Random  job  addition  mode 

You  may  enter  Random  job  addition  mode  by  selecting  “Random  Addition  Mode”  from  the  Jobs 
menu.  Alternatively,  you  may  simply  press  R  to  toggle  between  Random  Addition  Mode  and  User 
Control  Only.  When  in  Random  Addition  Mode,  jobs  will  appear  on  the  map  window  as  defined 
by the random distributions (see Setting options below).

A.4.5  Setting  options 

The  options  may  be  found  in  the  Option  menu. 

•  “Allocation Strategy” allows you to select one of the available allocation modules. This may
be  done  at  any  time,  even  when  the  simulation  is  running. 

•  “Random  Task  Parameters”  brings  up  a  dialog  that  allows  you  to  adjust  the  distributions 
for Deadline, Work To Do, and Next Job (see Random distribution settings). The bottom
number  in  the  range  should  be  set  in  the  “Min”  spin  box.  The  Range  slider  is  not  the  top 
value  of  the  range,  but  the  difference  between  the  two  values.  So,  for  a  Deadline  range  of 
40-100,  set  Min  to  40,  and  the  Range  slider  to  60.  The  two  checkboxes  allow  you  to  enable 
and  disable  the  Add  random  dependencies  and  homogenous  jobs  options. 

•  “Proficiency  Gain”,  when  enabled,  causes  resources  to  gain  proficiency  when  working  on  a 
task. See Resource proficiency gain.

•  “Show Textbox” attaches a scrolling text box to the bottom of the user interface, which
outputs messages whenever a job is added, succeeds, fails, or any one of several other events
occurs. It generally displays too much cryptic information to be very useful except in
debugging.


A.4.6  Scenario  files 

Scenario  files  allow  a  sequence  of  added  jobs  and  their  dependencies  to  be  saved  into  a  file  and 
replayed  again.  A  scenario  file  records: 

•  All  jobs  that  have  been  added  since  time  0,  by  any  means,  along  with  their  type,  starting 
time,  deadline,  work  to  do,  location,  and  dependencies  if  any.  Any  jobs  that  are  scheduled  to 
be  added  in  the  future  are  also  added  to  the  scenario  file. 

•  Starting locations and beginning proficiency levels for all resources. If you want the
simulation to have fewer resources, make a scenario file and then edit it with a text editor, removing
a few of the Resource lines.

To  save  a  scenario  file,  select  “Save  Scenario”  from  the  File  menu.  You  will  be  prompted  for  a 
filename.  To  load  a  scenario,  select  “Open  Scenario”  from  the  File  menu.  The  time  will  be  reset  to 
0, and all information about successes, failures, past jobs, future jobs, etc., will be forgotten. The
Events  future  number  on  the  status  bar  will  show  a  non-zero  number  if  there  are  any  jobs  in  the 
scenario  file  you  loaded.  If  you  only  want  the  jobs  from  the  scenario  to  be  added,  make  sure  that 
Random  Addition  mode  is  turned  off.  To  restart  the  current  scenario,  choose  “Restart”  from  the 
Action  menu.  To  forget  all  past  and  future  job  information,  and  reset  the  resources,  choose  “New 
Scenario”  from  the  File  menu.  And  finally,  to  forget  all  past  and  future  job  information,  while 
keeping  the  time  and  resources  as  they  are,  choose  “Clear  Events”  from  the  Action  menu. 


A.4.7  Graphing  resource  interaction 

Graphing  resource  interaction  requires  that  Matlab,  version  5  or  6,  be  installed  on  the  computer. 
Matlab  must  also  be  in  the  path,  so  that  the  TaskSim  simulation  can  start  it  in  the  background  with 
the simple command “matlab”. If you have Matlab, you can graph resource interaction over the
last  100  time  units  by  clicking  the  “Graph”  button,  or  selecting  Graph  from  the  Action  menu.  You 
may  need  to  wait  for  some  seconds  before  a  Matlab  graph  appears.  The  graph  will  consist  of  three 
parts.  In  the  top  two  sections,  each  job  that  existed  during  the  last  100  time  units  will  show  up  as  a 
certain  color.  That  color  will  have  nothing  to  do  with  the  job’s  type.  The  top  section  of  the  graph 
shows  work  left  against  time  for  all  the  jobs.  The  middle  section  shows  what  percentage  of  the 
available resources were held by each job, again over time. The bottom graph shows a green and a
red line. The green line is the number of successes so far, and the red line is the number of failures.

A.5 batchsim - for batch simulations

batchsim is the tool to use for series of non-interactive simulations, or batch simulations. It can run
with options on the command line, or bring up a graphical user interface (GUI) to allow the user
to set the options visually.

A.5.1 Extra configuration considerations

Some  configuration  parameters  need  to  be  set  for  batch  simulations  that  are  not  needed  in  the 
interactive  simulation.  These  include: 

•  The  Random  seed  -  The  pseudo-random  number  generator  needs  to  be  “seeded”.  When  a 
series  of  simulations  is  started,  the  specified  seed  is  planted  in  the  random  number  generator. 
This  means  that  if  you  run  the  same  series  of  simulations  again  with  the  same  seed,  without 
changing  any  other  options,  the  outcome  will  always  be  the  same.  You  can  write  down  the 
seed  and  use  it  again  later  to  always  get  the  same  results.  The  simulator  can  come  up  with  a 
number  for  the  seed  (from  the  Linux  kernel  or  the  system  timer)  if  you  do  not  specify  one. 

•  Runlength - the number of time units that each simulation in the series should run.

•  Number of runs - the number of runs in the series. Each successive run will be different.


A.5.2  The  batch  control  panel 


The batch control panel window will come up if you run batchsim without any options, or if you
specify a “-g” or “--batch-gui” option among others. It allows you to set the above configuration
parameters,  along  with  other  options  that  can  be  found  in  the  interactive  simulation.  See  Simulation 
options  for  more  information  about  them.  When  you  have  all  the  options  the  way  you  want  them, 
push  the  “Run”  button.  In  the  “State”  box,  you  can  watch  the  progress  of  the  series.  The  Run 
button  will  remain  depressed  as  long  as  the  series  of  simulations  is  running.  If  it  is  taking  too 
long,  click  the  depressed  Run  button  again  to  cancel  the  run.  The  results  up  to  that  point  will  be 
displayed.  When  you  are  finished  using  the  batch  control  panel,  click  the  Close  button  or  close  the 
window  in  the  usual  way  for  your  window  manager. 

A.5.3  The  command  line  interface 

batchsim can be run at the command line, and all configuration options are still available as
program arguments. You can get a summary of the available options by running “./batchsim -h”.

$ ./batchsim -h
Simulation options:

--module, -m<filename>: Loads specified allocation module
--timetorun, -t<num>: Runs simulation for <num> time units
--resources, -r<num>: Sets number of resources
--runs, -x<num>: Runs <num> successive simulations- shows average
--dist-deadline, -D<low>-<high>: Sets distribution of job deadline length
--dist-jobsize, -R<low>-<high>: Sets distribution of job workload size
--dist-nextjob, -T<low>-<high>: Sets distribution of time until next job
--res-speed, -z<num>: Sets resource motion speed
--use-seed, -s<num>: Sets random seed- useful for duplicating runs
--dependencies, -d: Turns on random dependencies
--proficiencies, -p: Turns on resource proficiency gain
--homogenous, -1: Makes all random jobs the same type
--verbose, -v: Puts diagnostic output to stderr (noisy)
--show-info, --show-values, -i: Displays values of many useful parameters
--batch-gui, -g: Run with the batch simulation control panel (GUI)
--help, -h: Displays this help text

When started with no recognized options, the program attempts to open
the batch simulation control panel.

$

Any of the above options are available when running the batch control panel; simply add -g or
--batch-gui to the options. Since the options are mostly explained above, an exhaustive explanation
will not be undertaken here. It is worth noting that the -i option (--show-info, --show-values) is a
very valuable one. It shows what random seed is being used, all the random distribution parameters,
and the state of nearly all the other options.
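
For scripted experiments, such as comparing allocation modules under one seed, the command lines can be generated rather than typed. The Python sketch below only builds argument lists from the long option names in the help text; it does not run the simulator. The batchsim_argv helper is hypothetical, and it assumes that every long option accepts the --name=value form that this section shows for --use-seed and --module.

```python
import shlex


def batchsim_argv(options, program="./batchsim"):
    """Build an argument list from a dict of long option names (hypothetical helper).

    A value of True emits a bare flag such as --show-info; False or None
    omits the option; anything else is emitted as --name=value (assumed form).
    """
    argv = [program]
    for name, value in options.items():
        if value is True:
            argv.append(f"--{name}")
        elif value is False or value is None:
            continue
        else:
            argv.append(f"--{name}={value}")
    return argv


# One command line per allocation module, all sharing the same seed.
commands = [
    batchsim_argv({"runs": 3, "use-seed": 870897571,
                   "module": module, "show-info": True})
    for module in ("democratic", "crisis")
]

for argv in commands:
    print(shlex.join(argv))  # shell-quoted form, suitable for a log or script
```

Including --show-info in every generated command keeps the seed and distribution parameters in the output, so each result records the settings that produced it.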

To  give  a  feel  for  the  way  these  options  work,  here  are  some  example  runs: 

$ ./batchsim -x3 -i
TaskSim Batch Simulator

Version 2.3.7 (against libpaul 0.0.6sg)
brought to you by paul cannon 2001
space software lab/utah state university

Random seed: 870897571
Runlength: 300
Number of runs: 3
Number of resources: 30
Resource move speed: 3.00
Allocation module: gensched.so
Deadline distribution: 40-100
Jobsize distribution: 70-600
Next job distribution: 1-10
Number of random job types: 3
Random dependencies: off
Resource proficiency gain: off

Run    Succ  Fail  Perc
1        51     1  0.98
2        49     1  0.98
3        49     0  1.00
Total   149     2  0.99

$ ./batchsim -x3 -i --use-seed=870897571 --module democratic
TaskSim Batch Simulator

Version 2.3.7 (against libpaul 0.0.6sg)
brought to you by paul cannon 2001
space software lab/utah state university

Random seed: 870897571
Runlength: 300
Number of runs: 3
Number of resources: 30
Resource move speed: 3.00
Allocation module: democratic.so
Deadline distribution: 40-100
Jobsize distribution: 70-600
Next job distribution: 1-10
Number of random job types: 3
Random dependencies: off
Resource proficiency gain: off

Run    Succ  Fail  Perc
1         6    41  0.13
2         9    36  0.20
3         8    38  0.17
Total    23   115  0.17
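
The Perc column in the two runs above is consistent with the fraction of jobs that succeeded, Succ / (Succ + Fail), shown to two decimal places. This reading is inferred from the printed numbers and is not stated by the program. The Python sketch below, whose helper names are hypothetical, parses such a table and checks that interpretation against the figures printed above.

```python
GENSCHED = """\
Run    Succ  Fail  Perc
1        51     1  0.98
2        49     1  0.98
3        49     0  1.00
Total   149     2  0.99
"""

DEMOCRATIC = """\
Run    Succ  Fail  Perc
1         6    41  0.13
2         9    36  0.20
3         8    38  0.17
Total    23   115  0.17
"""


def parse_results(text):
    """Return (label, succ, fail, perc) tuples for the data rows of a table."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        label, succ, fail, perc = fields
        try:
            rows.append((label, int(succ), int(fail), float(perc)))
        except ValueError:
            continue  # header or rule line, not a data row
    return rows


def success_fraction(succ, fail):
    """Fraction of jobs that succeeded; 0.0 when no jobs were counted."""
    total = succ + fail
    return succ / total if total else 0.0


def table_is_consistent(text, tolerance=0.005):
    """Check that Total sums the runs and that Perc matches Succ/(Succ+Fail)."""
    rows = parse_results(text)
    runs = [row for row in rows if row[0] != "Total"]
    total = next(row for row in rows if row[0] == "Total")
    sums_ok = (sum(r[1] for r in runs) == total[1]
               and sum(r[2] for r in runs) == total[2])
    perc_ok = all(abs(success_fraction(succ, fail) - perc) <= tolerance
                  for _, succ, fail, perc in rows)
    return sums_ok and perc_ok


print(table_is_consistent(GENSCHED), table_is_consistent(DEMOCRATIC))
```

Because both series used the same seed and the same distribution settings, the difference between the two tables is the allocation module (gensched.so versus democratic.so).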

$ ./batchsim -x=3 --module=crisis -p --res
TaskSim Batch Simulator

Version 2.3.7 (against libpaul 0.0.6sg)
brought to you by paul cannon 2001
space software lab/utah state university

Run    Succ  Fail  Perc
1        52     0  1.00
2        58     0  1.00
3        53     0  1.00
-----  ----  ----  ----
Total   163     0  1.00

$ ./batchsim --dist-deadline 10-20
TaskSim Batch Simulator

Version 2.3.7 (against libpaul 0.0.6sg)
brought to you by paul cannon 2001
space software lab/utah state university

Run    Succ  Fail  Perc
-----  ----  ----  ----
1        14    49  0.22